An Atypical Life

the life and times of Joan Touzet

CouchDB 0.9.0 bulk document post performance

2009 May 12 18:28

Based on a tip from my university colleague Chris Teplovs, I started looking at CouchDB for some analytics code I’ve been working on for my graduate studies. My experimental data set is approximately 1.9 million documents, with an average document size of 256 bytes. Documents range in size from approximately 100 to 512 bytes. (FYI, this represents about a 2x increase in size from the raw data’s original form, prior to the extraction of desired metadata.)

I struggled for a while with performance problems in initial data load, feeling unenlightened by other posts, until I cornered a few of the developers and asked them for advice. Here’s what they suggested:

  1. Use bulk insert. This is the single most important thing you can do. It reduced the initial load time from ~8 hours to under an hour, and avoids the need to compact the database. Baseline time: 42 minutes, using 1,000 documents per batch.

  2. Don’t use the default _id assigned by CouchDB. It’s just a random ID and apparently really slows down the insert operation. Instead, create your own sequence; a 10-digit sequential number was recommended. This bought me a 3x speedup and a 6x reduction in database size. Baseline time: 12 minutes, again using 1,000 documents per batch.
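Here is a minimal sketch of how those two tips might combine, not the loader I actually ran. It assumes the documents are plain Python dicts and posts each batch to CouchDB's `_bulk_docs` endpoint using only the standard library. The helper names (`sequential_id`, `bulk_payloads`, `post_bulk`) are made up for illustration.

```python
import json
import urllib.request
from itertools import islice


def sequential_id(n, width=10):
    """Zero-padded sequential ID, e.g. 42 -> '0000000042'."""
    return str(n).zfill(width)


def batched(docs, batch_size):
    """Yield lists of at most batch_size documents from any iterable."""
    it = iter(docs)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def bulk_payloads(docs, batch_size=1000, start=0):
    """Yield one JSON-encoded _bulk_docs request body per batch.

    Each document gets a 10-digit sequential _id instead of relying on
    the random ID CouchDB would assign by default.
    """
    next_id = start
    for batch in batched(docs, batch_size):
        body = {
            "docs": [
                dict(doc, _id=sequential_id(next_id + i))
                for i, doc in enumerate(batch)
            ]
        }
        next_id += len(batch)
        yield json.dumps(body).encode("utf-8")


def post_bulk(db_url, payload):
    """POST one batch to <db_url>/_bulk_docs and return the parsed reply."""
    req = urllib.request.Request(
        db_url.rstrip("/") + "/_bulk_docs",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

To load a data set, loop over `bulk_payloads(docs)` and pass each payload to `post_bulk`, with a database URL such as `http://localhost:5984/mydb` (the database name here is hypothetical).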

Using 1,000 documents per batch was a wild guess, so I decided it was time to run some tests. Using a simple shell script and GNU time, I generated the following plot of batch size vs. elapsed time:

Strange bulk insert performance under CouchDB 0.9.0

The more-than-exponential growth at the right of the graph is expected; however, the peak around 3,000 documents per batch is not. I was so surprised by the results that I ran the test 3 times – and got consistent data. I’m currently running a denser set of tests between 1,000 and 6,000 documents per batch to qualify the peak a bit better.
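For anyone who wants to repeat the measurement, here is a rough sketch of a timing harness. My own runs used a shell script and GNU time, so this is a Python stand-in rather than what produced the plot. The `load` argument is a hypothetical callable that performs one complete data load at the given batch size; presumably it should start from a fresh, empty database each time so the runs are comparable.

```python
import time


def time_batch_sizes(load, batch_sizes, repeats=3):
    """Time load(batch_size) for each batch size, several times over.

    Returns a dict mapping batch size -> list of elapsed seconds,
    one entry per repeat, so run-to-run consistency can be checked.
    """
    results = {}
    for size in batch_sizes:
        timings = []
        for _ in range(repeats):
            started = time.perf_counter()
            load(size)
            timings.append(time.perf_counter() - started)
        results[size] = timings
    return results
```

Calling `time_batch_sizes(load, range(1000, 6001, 500))` would cover the region around the unexpected peak; the step of 500 is an arbitrary choice.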

Are there any CouchDB developers out there who can comment? You can find me on the #couchdb freenode channel as well.

Categories
Development, Random, Software
Tags
couchdb

italy photos online

2009 Mar 24 15:42

The Cascata delle Marmore waterfall.

Just like the subject says, the Italy vacation photos are online.

Categories
Memories, Travel

forever anonymous

2009 Mar 24 4:33

Dragging myself to consciousness, I gasped for air. The images of gravestones and memorials of the intellectual elite, festooned with working mainframe key punches and proofs of famous mathematical theorems in honour of their contributions to society, still lingered. I could still feel the dirt being shoveled on top of me prematurely, as I struggled to break free of my restraints. The bottoms of my lungs burned like the teenage mistake of inhaling deeply from a clove cigarette. Still, it burned less than the stinging sensation of my subconscious clawing through a thin layer of conceit I’d previously put up in my life to hide the twin holes of fear and shame.

What a terrible metaphor. I’ll start again.

Lately I’ve been working harder on my doctoral research, in the hopes that this may be my best chance to leave something of value behind in this world. I’ve chosen not to raise children, and judging from the superior job my friends J., D., N. and S. are doing with theirs, I made the right choice. I then look at my friends with some life calling they’ve dedicated themselves to since childhood, and wonder how different my life would have been if I only could have settled on one thing to do, rather than insisting on being a polymath.

Then I think of the reality: it’s a huge conceit to pretend that anyone will remember me at all 5 years after my death, let alone 500. The odds of that are so low that they’re unthinkable. But why the terror of being forgotten, if it’s so likely? And how does this square against my personal philosophy that it’s the ideas that count, not the people?

OK, so I’ll substitute idea for name, and the good feeling I would get from knowing I helped other people far into the future. It’s still a huge conceit to pretend that anyone will remember my ideas at all 5 years after I come up with them, let alone 500. So why fear the inevitable? And why put so much pressure on myself to achieve something that’s rare (and, most likely, out of my hands)?

Interestingly, I don’t fear death. I think I overcame that one many years ago when I struggled with some of the other demons in my life, and came out on top. Perhaps this is my chance to overcome this ridiculous idea as well. There was definitely a time at which I really bought into the idea that helping just one person was as good as helping a whole flotilla – and I’ve definitely helped at least one person in an immense way. What changed?

Now, after talking it out online, I’m fairly sure it was the dawning realization, though not quite ecological, that the resources I have burned (money, time and patience of those smarter than I, and yes, non-renewables like jet fuel and natural gas) are above the average. I also have been given a lot of unique and interesting opportunities, and gained a lot of special skills. It was then I decided that I had to do everything I could to put all of that good stuff back into the world, in as many ways as possible. Anything else is just conspicuous consumption. I owe it to everyone else to do as good a job as I can to pay things back.

The thing is, my life is almost all about paying it back in one way or another. I’ve had my phases of acting spoiled, but after being forced (almost at knife-point) to volunteer my time in college, helping others has become a sort of addiction, perhaps even an unhealthy one. My job for many years now has been nothing but helping other people get their jobs done more effectively. I teach, I learn (then re-teach what I learn), I research (and then teach what I find out), and I volunteer my time when I know full well I really should be taking it for myself. I helped keep a household of friends going when no others could really make ends meet. I’ve helped those in need find money for surgery, and given them the emotional strength necessary to pull through. It’s never felt like an obligation. But it’s never felt like it’s enough.

I treat that feeling of emptiness inside as telling me that I still have more work to do, more in me to give to others. Perhaps it is only a twisted redirection of guilt and shame, a hope at becoming immortal in some sense. But I don’t work hard only to see my name in lights (especially since it never happens), or predicate it only on the knowledge I’ll gain something out of it. It’s because I like doing a good job, because it does feel good to know I’ve accomplished something concrete, like presenting my first paper in several years at a conference. I’ll do it even if I’m ignored, or if someone else claims the credit for my work. (Unfortunately this sometimes makes me a poor businesswoman.) And, more importantly than anything else, I know that when I find I am doing something that causes harm to someone else, I change how I act. This is my own private version of walking softly; I have yet to figure out how to correctly carry a big stick, so I walk with my hands in my pockets instead.

And yet I still have dreams like this one tonight that wake me at 3:30 and prevent me from going back to sleep, and keep me writing uncontrollably. What am I missing? Am I acting reasonably? Should I be doing more? Less? Something different? I know I’m not the only person who has felt this way, but I also know my attention span is so poor right now that I can’t think of where to start researching to look at motivations of the great, the noble, the weak and those of the despicable monsters. I need a sanity check (and maybe a kick in the ass) so I can move forward. I refuse to sink into solipsistic musings, but a little introspection every now and again can’t hurt!

Categories
Memories

thing-a-day #15: deep fried kitchen

2009 Feb 15 23:16

tonight i went overboard and deep fried things. yanno, when the oil is hot, ya gotta use it, right? my beer batter included sleeman’s cream ale, flour, and two kinds of bacon salt.

besides the fairly mundane broccoli and onions, i also deep fried local organic cheese curds. they are now my new favourite vs. mozzarella sticks.

but the real amazing items were the deep fried bounty bars, in the same batter (yes with bacon salt). these things have no right to taste this good. seriously. if you live in the US try finding bounty bars (coconut enrobed in dark chocolate), they’re in many places now. if not you will have to use the inferior mounds bars.

Categories
Cooking
Tags
thing-a-day

thing-a-day #14: teaching a friend about digital performer

2009 Feb 15 23:07

today I was offline the entire day. I taught my friend dys everything i’ve learned about digital performer and my mc mix controller. verdict: good. want to play with it together and make music.

yes this is a bit lame for you to read, but it was very satisfying for me to do. i like helping other people. the end.

Categories
Music
Tags
thing-a-day

thing-a-day #13: eternal child

2009 Feb 14 0:35

done in one take: chick corea’s Eternal Child. Instrument: Yamaha TX816. Effects by Lexicon and MOTU.

Categories
Music
Tags
thing-a-day

thing-a-day #12: virtual machine ’09

2009 Feb 12 23:14

can’t publish this one – but i created some awesome reports in a virtual machine for my employer today. really slique. very cool. cognos-based. anything more and i’d have to kill you. ;)

Categories
Random

thing-a-day #11: mac stabilization

2009 Feb 11 21:44

been fighting my mac for weeks now, with constant freezes, hangs, system-wide crashes or video corruption from 1-100 minutes after reboot. seems i followed some bad advice in the past and turned on something i shouldn’t have. so, my thing for today is sharply worded advice:

Do not enable QuartzGL (2D acceleration) on your Mac Pro. To turn QuartzGL off, open Terminal, paste in this line and press Return:

sudo defaults write /Library/Preferences/com.apple.windowserver QuartzGLEnabled -boolean NO

Reboot to make this take effect. Voila, no more annoying crashes. You’re welcome.

Categories
Development, Hardware

deleting users

2009 Feb 11 16:59

If you’ve had trouble posting on my blog since I opened up comments, you should be able to do so now. I’ve deleted all registered users – so, if you were registered before, now you shouldn’t be forced to log in just to comment.

Categories
Random

thing-a-day #10: choc chip cookies

2009 Feb 10 22:18

Still burned out on music. So I made chocolate chip cookies, from this NY Times recipe. I had to make a few substitutions because of what I had on hand:

  • Whole wheat flour instead of regular flour (still used the cake flour)
  • Dark brown sugar instead of light brown sugar
  • President’s Choice chocolate chips instead of the gourmet ones suggested

Delicious!

Categories
Random
