Juan-Carlos Gandhi (juan_gandhi) wrote 2011-08-02 03:57 pm
stuff in memory
So we played with storing a bunch of more or less "static" data in HBase: 1.m records. Reading them takes 20 minutes, since it's wiring, converting to strings, converting to maps, parsing into objects, and caching the index in memory. Then for each key we'd look it up in the index (finding a range) and go to HBase to retrieve the data. If there are 10k keys in a batch, kaboom: it takes forever for an innocent user request (one that involves a third-party service returning 10k records).
So we decided to store all our data in memory, reading it from a plain csv file.
It flies: 20' for reading, merging records that can be merged, and storing in a TreeSet. I love treesets: prefix search works naturally, and the whole paraphernalia is just several hundred lines of verbose Java code.
Like this:
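    @Override
    public Map<String, Entity> locate(Iterable<String> keys, Features features) {
        Map<String, Entity> result = Maps.newHashMap();
        for (String key : keys) {
            // Everything ordered strictly before the key; the last one is the closest.
            SortedSet<Entity> before = entitiesBefore(key);
            if (!before.isEmpty()) {
                Entity candidate = before.last();
                // Only the closest entity is checked; keys with no match are left out.
                if (candidate.matches(key, features)) {
                    result.put(key, candidate);
                }
            }
        }
        return result;
    }

    // headSet is exclusive: it is a view of the entities that sort before the probe.
    private SortedSet<Entity> entitiesBefore(String key) {
        return content.headSet(new Entity(key));
    }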
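For anyone who wants to try the trick outside our codebase, here is a minimal self-contained sketch. It is not the real code. PrefixEntry, PrefixIndexSketch, locateAll and the sample rows are all made up for illustration. It assumes each csv line is a plain "prefix,name" pair with no quoting, and that no stored prefix is itself a prefix of another stored one. It also uses TreeSet.floor instead of headSet(...).last(), so an exact match on the key counts too.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical stand-in for Entity: ordered by its prefix only.
final class PrefixEntry implements Comparable<PrefixEntry> {
    final String prefix;
    final String name;

    PrefixEntry(String prefix, String name) {
        this.prefix = prefix;
        this.name = name;
    }

    @Override
    public int compareTo(PrefixEntry other) {
        return prefix.compareTo(other.prefix);
    }
}

final class PrefixIndexSketch {
    private final TreeSet<PrefixEntry> content = new TreeSet<>();

    // Assumption: every line looks like "prefix,name"; quoting and escaping are not handled.
    PrefixIndexSketch(Iterable<String> csvLines) {
        for (String line : csvLines) {
            String[] cols = line.split(",", 2);
            content.add(new PrefixEntry(cols[0], cols[1]));
        }
    }

    // Returns the name of the entry whose prefix starts the key, or null if there is none.
    String locate(String key) {
        // floor() is the greatest entry <= the probe, so an exact match is included.
        PrefixEntry candidate = content.floor(new PrefixEntry(key, null));
        return candidate != null && key.startsWith(candidate.prefix) ? candidate.name : null;
    }

    // Batch version: keys without a matching prefix are simply left out of the result.
    Map<String, String> locateAll(Iterable<String> keys) {
        Map<String, String> result = new HashMap<>();
        for (String key : keys) {
            String name = locate(key);
            if (name != null) {
                result.put(key, name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample rows, standing in for the lines of the csv file.
        PrefixIndexSketch index =
            new PrefixIndexSketch(Arrays.asList("415,San Francisco", "212,New York"));
        System.out.println(index.locateAll(
            Arrays.asList("4155550100", "2125550199", "9995550000")));
    }
}
```

Each lookup is a single O(log n) descent of the tree, which is why a 10k-key batch stops being scary. If stored prefixes could nest (say both "41" and "415"), checking only the closest candidate would not be enough and the lookup would have to retry with a shorter probe.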
I love treesets! So nifty.