Aside from the topic, which is interesting in a nerdy, rabbit-hole way, I found it immensely calming that despite today's relentless, exhausting AI sonic boom, there are people working to optimize a 50-yr-old algorithm for doing something both mundane and very applicable. Maybe humanity is not doomed after all.
But unfortunately HN comment threads are still about AI or about other comments even when the OP is not.
> said Guy Blelloch
Oh jeez now I have to read the rest.
More people need to read Blelloch's Ph.D. thesis, Vector Models for Data-Parallel Computing. It's a mind-blowing way to think of parallel computation.
He is perhaps one of the best parallel programming / parallel data structures professors on the planet.
------
Awwww, it's not so much about Blelloch's work; instead, he's probably the guy ACM asked to help explain and understand this new paper on the Bookshelf problem. Still a great read, though I was hoping for some crazy parallel programming application here.
"Their new algorithm adapts to an adversary’s strategy, but on time scales that it picks randomly"
"Even though many real-world data settings are not adversarial, situations without an adversary can still sometimes involve sudden floods of data to targeted spots, she noted."
This is pretty neat. I bet this will find practical applications.
Yeah, this seems applicable to algorithmic management of fill factor in B+ tree based databases.
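A toy sketch of that fill-factor idea, modeling only the leaf level of a hypothetical B+ tree (the capacity, workload, and names are made up, nothing from the paper): splitting an overflowing leaf 90/10 instead of 50/50 leaves pages much fuller under an append-heavy key sequence, which means fewer splits overall.

    import bisect

    CAPACITY = 64  # hypothetical leaf capacity

    def insert_all(keys, split_fraction):
        # leaves: a list of sorted key lists standing in for B+ tree leaf pages;
        # when a leaf overflows, `split_fraction` of its keys stay in the left node
        leaves, splits = [[]], 0
        for k in keys:
            i = 0
            # find the leaf whose key range should hold k (linear scan keeps the sketch simple)
            while i + 1 < len(leaves) and leaves[i + 1][0] <= k:
                i += 1
            bisect.insort(leaves[i], k)
            if len(leaves[i]) > CAPACITY:  # overflow: split the leaf
                cut = int(len(leaves[i]) * split_fraction)
                leaves.insert(i + 1, leaves[i][cut:])
                leaves[i] = leaves[i][:cut]
                splits += 1
        return splits, len(leaves)

    ascending = list(range(10_000))  # append-heavy workload (monotonically increasing keys)
    for frac in (0.5, 0.9):
        splits, n_leaves = insert_all(ascending, frac)
        print(f"split {frac:.0%}/{1 - frac:.0%}: {splits} splits, {n_leaves} leaves")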
Are "adversaries" broadly used in algorithm design? I've not seen that before. I'm used to edge cases and trying to break things, but an "adversary", especially white box, seems different.
Yes. There is a whole sector of algorithm design called online algorithms, dedicated to studying algorithms that must make decisions without complete information. A common analysis technique proves the "competitive ratio" of an algorithm by analyzing its worst-case performance against an adversary. In fact, this article is about the analysis of one particular online problem. For a simple introduction, you can check out "the ski rental problem." More complex applications include things like task scheduling and gradient descent.
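For a concrete feel for the ski rental problem, here is a minimal sketch of the textbook break-even strategy (the purchase price B and the day range are made-up numbers): rent until the total rent would equal the purchase price, then buy. Whatever season length the adversary picks, the cost stays within a factor of about 2 of the offline optimum.

    B = 10  # purchase price, in units of one day's rent (illustrative number)

    def algorithm_cost(days):
        # break-even rule: rent for the first B-1 days, buy on day B
        return days if days < B else (B - 1) + B

    def optimal_cost(days):
        # offline optimum that knows the number of ski days in advance
        return min(days, B)

    worst = max(algorithm_cost(d) / optimal_cost(d) for d in range(1, 1000))
    print(f"worst observed ratio: {worst:.2f}")  # 1.90 here; the general bound is 2 - 1/B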
Adjacent to this topic are algorithms for two-player games, like minimax, which depend on imagining an adversary that plays perfect counter-moves.
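A tiny sketch of that imagined perfect adversary, using a toy take-1-to-3-stones game (the game and names are illustrative, not from the article): a position is winning exactly when some move leaves the perfectly-playing opponent in a losing one.

    from functools import lru_cache

    @lru_cache(maxsize=None)
    def can_win(stones):
        # True if the player to move can force a win, assuming the opponent
        # (the adversary) also plays perfectly
        if stones == 0:
            return False  # previous player took the last stone and already won
        # a position is winning if some legal move leaves the opponent in a losing position
        return any(not can_win(stones - take) for take in (1, 2, 3) if take <= stones)

    print([n for n in range(1, 13) if not can_win(n)])  # losing positions: 4, 8, 12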
In a similar vein, in ML, there is a model called generative adversarial networks (GANs) in which 2 networks (a generator and discriminator) play a minimax game against each other, improving the capability of both models at once.
It really depends on the particular group of algorithms. I'm only considering non-cryptographic algorithms here.
As a general rule, any algorithm that involves a hash or a random/arbitrary choice has historically been analyzed under "assume no adversary", and even now has only advanced to "assume an incompetent adversary".
By contrast, most tree-adjacent algorithms have always been vigilant against competent adversaries.
Really??
Quicksort, mergesort, and heapsort are commonly analyzed against worst-case / adversary-chosen inputs (a toy sketch of this follows below).
I know that binary trees (especially red-black trees, AVL trees, and other self-balancing trees) have been studied extensively against adversaries picking the worst-case scenario.
And finally, error-correcting coding schemes / Hamming distances and other data-reliability mechanisms (e.g., CRC32 checks) have proofs based on worst-case adversary bounds.
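To make the quicksort point concrete, here is a throwaway sketch (naive first-element pivot; comparison counting is approximate, and real library sorts guard against exactly this): an adversary who hands the algorithm already-sorted input forces roughly n^2/2 comparisons, while a random ordering stays near n log n.

    import random

    def quicksort(a, counter):
        # naive quicksort with a first-element pivot; counter[0] tallies
        # one comparison per element partitioned against the pivot
        if len(a) <= 1:
            return a
        pivot, rest = a[0], a[1:]
        counter[0] += len(rest)
        left = [x for x in rest if x < pivot]
        right = [x for x in rest if x >= pivot]
        return quicksort(left, counter) + [pivot] + quicksort(right, counter)

    n = 500  # kept small so the sorted case stays inside Python's recursion limit
    for name, data in [("adversarial (already sorted)", list(range(n))),
                       ("random order", random.sample(range(n), n))]:
        count = [0]
        quicksort(data, count)
        print(f"{name:>28}: {count[0]} comparisons")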
-------
If anything, I'm struggling to think of a case where the adversary / worst-case performance is NOT analyzed. In many cases, worst-case bounds are easier to prove than average-case ones... so I'd assume most people start with worst-case analysis before moving to average-case analysis.
I think there's a distinction between worst-case and adversarial behavior.
For some types of problems, identifying worst-case behavior is straightforward. For example, in a hash table lookup the worst-case is when all keys hash to the same value. To me, it seems like overkill to think in terms of an intelligent adversary in that case.
But in the problem described here, the worst case is harder to construct, especially while exploring the solution space, given that slight tweaks to the solution can significantly change the nature of the worst case. Thinking of it as adversarial implies thinking in terms of algorithms that dynamically produce the worst case rather than trying to just identify a static worst case that is specific to one solution. I can imagine that approach significantly speeding up the search for better solutions.
They are certainly used in anything cryptographic.
Here is a 2011 article about DoS attacks against web apps enabled by hash-table-based dicts: https://www.securityweek.com/hash-table-collision-attacks-co...
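A minimal sketch of the underlying idea, not of any specific attack from that article: if an adversary can make every key collide, dictionary operations degrade from constant time to linear, so building the table becomes roughly quadratic. CPython randomizes string hashing precisely to make this hard, so the demo uses a deliberately degenerate custom hash.

    import time

    class CollidingKey:
        # adversarial key type: every instance hashes to the same bucket
        def __init__(self, v):
            self.v = v
        def __hash__(self):
            return 42  # constant hash, so every key collides
        def __eq__(self, other):
            return self.v == other.v

    def build_time(n, key_type):
        start = time.perf_counter()
        _ = {key_type(i): i for i in range(n)}
        return time.perf_counter() - start

    n = 2_000  # even a few thousand colliding keys is noticeably slow
    print(f"colliding keys: {build_time(n, CollidingKey):.3f}s")  # roughly quadratic work
    print(f"ordinary ints:  {build_time(n, int):.3f}s")           # effectively linear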
djb has long advocated “crit bit trees”, ie tries: https://cr.yp.to/critbit.html
After trying to impose a total ordering on 7? 8? 9? bookcases, I tend to think that the sorted string table is the way to go. Order all the books in a bookcase, as a partition of a book collection. As you buy more books, add them to a new bookcase. Don’t worry about total ordering at all; enjoy the kismet of putting different kinds of books in the same case, while keeping things fairly findable.