Tuesday, March 27, 2012

30k entries (I), aka computers are fast

Let's assume a document storage system with a maximum working set of 30K documents. Let's say we want to assign tags to documents and search based on those tags, with an average of 10 tags per document. Using the most brain-dead algorithm available, a linear scan of the document entries with a string comparison on each tag, what would it take to search through those documents? Could we maintain immediate feedback?

Measuring quickly on my laptop reveals that strcmp() takes around 8ns for a long matching string and 2ns for a non-match in the first character (with first character optimization). Splitting the difference and thus not taking into account that non-matches tend to be more common than matches, let's assume 5ns to compare each tag.
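A measurement along these lines can be sketched in a few lines of C. This is only an illustration: the helper name ns_per_strcmp and the use of clock() are assumptions, not the code behind the numbers above, and the result includes loop and indirect-call overhead, so it is best read as a rough upper bound.

```c
#include <assert.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: returns the average cost of one strcmp(a, b)
   call in nanoseconds, measured over `iterations` calls with the
   standard clock() timer. */
double ns_per_strcmp(const char *a, const char *b, long iterations)
{
    assert(iterations > 0);

    /* Calling through a volatile function pointer and storing into a
       volatile sink keeps the compiler from removing the comparison
       or hoisting it out of the loop. */
    int (*volatile compare)(const char *, const char *) = strcmp;
    volatile int sink = 0;

    clock_t start = clock();
    for (long i = 0; i < iterations; i++) {
        sink = compare(a, b);
    }
    clock_t end = clock();

    (void)sink;
    return (double)(end - start) / CLOCKS_PER_SEC * 1e9 / (double)iterations;
}
```

Calling it once with two long identical strings and once with two strings that differ in the first character should give the two ends of the range; the exact figures will vary by machine, compiler and C library.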

 5 ns/tag x 10 tags/document x 30K documents =
              50 ns/document x 30K documents =
                                    1500K ns =
                                     1500 µs = 1.5 ms

So an approach that takes longer than, say, 2ms to do such a search can probably be improved.
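For concreteness, the brain-dead linear scan could look roughly like the following. The Document layout, the fixed 10-slot tag array and the function name find_by_tag are assumptions made for this sketch; the post does not prescribe a data structure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TAGS_PER_DOC 10  /* the post's average, used here as a fixed size */

/* Hypothetical document record: a fixed array of tag strings.
   Unused slots are NULL. */
typedef struct {
    const char *tags[TAGS_PER_DOC];
} Document;

/* Linear scan: writes the indices of all documents carrying `query`
   into `hits` (which must have room for `ndocs` entries) and returns
   the number of hits. */
size_t find_by_tag(const Document *docs, size_t ndocs,
                   const char *query, size_t *hits)
{
    assert(docs != NULL && query != NULL && hits != NULL);

    size_t nhits = 0;
    for (size_t d = 0; d < ndocs; d++) {
        for (size_t t = 0; t < TAGS_PER_DOC; t++) {
            const char *tag = docs[d].tags[t];
            if (tag != NULL && strcmp(tag, query) == 0) {
                hits[nhits++] = d;
                break;  /* one matching tag per document is enough */
            }
        }
    }
    return nhits;
}
```

With 30K documents this performs at most 300K strcmp() calls per query, which is the count the estimate above is based on.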

Of course, we could do something slightly less thoroughly braindead and represent tags using integer, er, tags. A simple integer comparison should take less than one nanosecond, so the 300K comparisons would drop the time to below 300 µs. With that, we could do around 3000 queries per second, or 300 queries every tenth of a second (the generally accepted threshold for interactive performance).
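A minimal sketch of the integer variant, assuming each tag string has already been mapped to a non-zero 32-bit id somewhere else (that mapping is not shown). TaggedDoc, NO_TAG and find_by_tag_id are names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TAGS_PER_DOC 10
#define NO_TAG 0u  /* hypothetical sentinel marking an unused slot */

/* Hypothetical document record holding integer tag ids. */
typedef struct {
    uint32_t tags[TAGS_PER_DOC];
} TaggedDoc;

/* Same linear scan as with strings, but each comparison is a single
   integer compare. Writes matching document indices into `hits`
   (room for `ndocs` entries) and returns the number of hits. */
size_t find_by_tag_id(const TaggedDoc *docs, size_t ndocs,
                      uint32_t query, size_t *hits)
{
    assert(docs != NULL && hits != NULL);
    assert(query != NO_TAG);  /* the sentinel is not a searchable tag */

    size_t nhits = 0;
    for (size_t d = 0; d < ndocs; d++) {
        for (size_t t = 0; t < TAGS_PER_DOC; t++) {
            if (docs[d].tags[t] == query) {
                hits[nhits++] = d;
                break;
            }
        }
    }
    return nhits;
}
```

At 40 bytes of tag ids per document, 30K documents come to about 1.2 MB, so the whole scan works on a small, contiguous block of memory.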

In theory, we could actually start optimizing ever so slightly by storing lists of document ids with each tag and then simply doing set operations on the document lists stored with each tag. But we don't really have to.
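For completeness, the set operation on per-tag document lists might be sketched as below, assuming each tag keeps its document ids as an ascending array without duplicates. The function name intersect_sorted is hypothetical; a query for documents carrying two tags is then the intersection of the two lists.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Intersects two ascending, duplicate-free arrays of document ids.
   Writes the common ids into `out` (room for min(na, nb) entries)
   and returns how many were written. */
size_t intersect_sorted(const uint32_t *a, size_t na,
                        const uint32_t *b, size_t nb,
                        uint32_t *out)
{
    assert(out != NULL);

    size_t i = 0, j = 0, n = 0;
    while (i < na && j < nb) {
        if (a[i] < b[j]) {
            i++;                /* a[i] cannot appear in the rest of b */
        } else if (a[i] > b[j]) {
            j++;                /* b[j] cannot appear in the rest of a */
        } else {
            out[n++] = a[i];    /* id present in both lists */
            i++;
            j++;
        }
    }
    return n;
}
```

This touches each list entry at most once, so the cost depends on the length of the two lists rather than on the total number of documents.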

Thursday, February 16, 2012

radr:10876615 Allow signed binaries on iOS

Summary: OS X 10.8 Mountain Lion has a setting to allow binaries to be installed that are signed by registered developers. Please add this feature to iOS.

Filed as 10876615

Wednesday, January 11, 2012

Objective-C language of the year 2011!

After just barely missing out 2 years in a row (first to Go and then to Python), Objective-C finally managed to snatch Tiobe's programming language of the year award for 2011. Yay!

Objective-C managed a jump of 3.91% to 6.92% in 2011, and is now only a percent or so shy of C++ (8.06%), which incidentally just got edged out by C# (8.78%). It also zoomed ahead of Python (3.22%), which squeaked by Objective-C just barely last year to take top honors, but has now fallen below PHP and Visual BASIC.

Ruby, Perl and JavaScript are in the 1-3% range.

As TIOBE doesn't appear to keep old posts around, the above link is likely to go stale in about a month. Here is the table:

Position  Position  Delta in    Programming           Ratings   Delta
Jan 2012  Jan 2011  Position    Language              Jan 2012  Jan 2011  Status
--------  --------  ----------  --------------------  --------  --------  ------
1         1         =           Java                  17.479%   -0.29%    A
2         2         =           C                     16.976%   +1.15%    A
3         6         ↑↑↑         C#                    8.781%    +2.55%    A
4         3         ↓           C++                   8.063%    -0.72%    A
5         8         ↑↑↑         Objective-C           6.919%    +3.91%    A
6         4         ↓↓          PHP                   5.710%    -2.13%    A
7         7         =           (Visual) Basic        4.531%    -1.34%    A
8         5         ↓↓↓         Python                3.218%    -3.05%    A
9         9         =           Perl                  2.773%    -0.08%    A
10        11        ↑           JavaScript            2.322%    +0.73%    A
11        12        ↑           Delphi/Object Pascal  1.576%    +0.29%    A
12        10        ↓↓          Ruby                  1.441%    -0.34%    A
13        13        =           Lisp                  1.111%    +0.00%    A
14        14        =           Pascal                0.798%    -0.12%    A
15        17        ↑↑          Transact-SQL          0.772%    +0.01%    A
16        24        ↑↑↑↑↑↑↑↑    PL/SQL                0.709%    +0.15%    A
17        20        ↑↑↑         Ada                   0.634%    -0.05%    B
18        39        ↑↑↑↑↑↑↑↑↑↑  Logo                  0.632%    +0.29%    B
19        25        ↑↑↑↑↑↑      R                     0.609%    +0.07%    B
20        21        ↑           Lua                   0.559%    -0.08%    B

And the graph: [Image: TPCI trends, January 2012]