An Interesting Observation using POF Extractors

I have recently been doing some performance tests to see how Oracle Coherence performance degrades using the various off-heap storage methods available. Part of these tests was to run various queries that returned different amounts of data. The queries were each run with and without the relevant indexes, and all used a simple EqualsFilter with a PofExtractor.

POF basically holds the serialized data as a byte stream. This stream is not indexed, which means that to find the field with a particular POF ID you need to walk down the stream, looking at each field until you get to the one you want. Now, I have known for some time how POF and PofExtractors work, and have talked about it with various people, but it is only when you see some real data that you see the actual effect.
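To make the walk concrete, here is a minimal plain-Java sketch of the idea. This is NOT Coherence's real POF wire format (which uses variable-length encodings and type codes); it is just a simplified tag/length/value stream that shows why extraction cost grows with the POF ID of the target field.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Simplified sketch of the linear walk a PofExtractor has to do.
// NOTE: not Coherence's actual binary format -- purely illustrative.
public class PofWalkSketch {

    // Append one field as: int pofId, int length, utf8 bytes.
    static void writeField(ByteArrayOutputStream out, int pofId, String value) {
        byte[] data = value.getBytes(StandardCharsets.UTF_8);
        out.writeBytes(ByteBuffer.allocate(8).putInt(pofId).putInt(data.length).array());
        out.writeBytes(data);
    }

    // Walk the stream field by field until we hit the requested POF ID,
    // skipping over the value bytes of every earlier field.
    static String extract(byte[] stream, int targetId) {
        ByteBuffer buf = ByteBuffer.wrap(stream);
        while (buf.hasRemaining()) {
            int id  = buf.getInt();
            int len = buf.getInt();
            if (id == targetId) {
                byte[] data = new byte[len];
                buf.get(data);
                return new String(data, StandardCharsets.UTF_8);
            }
            buf.position(buf.position() + len);   // skip this field's value
        }
        return null;   // field not present
    }

    public static void main(String[] args) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Hypothetical POF IDs, matching the ones used in the tests below.
        writeField(out, 115, "foo");
        writeField(out, 127, "foo");
        writeField(out, 163, "foo");
        writeField(out, 173, "foo");
        byte[] stream = out.toByteArray();

        // Extracting ID 115 stops at the first field; extracting ID 173
        // has to step over three earlier fields first.
        System.out.println(extract(stream, 115));   // foo
        System.out.println(extract(stream, 173));   // foo
    }
}
```

The real PofExtractor does the same linear scan over the binary, which is why the position of the field in the stream matters.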

Basically the higher the POF ID of the target field the longer it takes to extract as you need to walk over more fields to get to the one you want.

This was highlighted in my testing because the original tests were extracting different numbers of entries using different fields. Just to be clear, I am talking about non-indexed queries here, which need to search the whole cache. When we analysed the figures, as expected, the query tests that matched a lot of entries were slower than those that matched fewer, as the serialization time of the result set would be different. But on closer inspection the tests returning more matches were faster than they should have been. After a bit more digging we realised that the test matching more entries was querying on a field with a much lower POF ID than the test matching fewer entries.

I wrote another test that ran a simple query against different fields in a single class. The fields were all String fields and I put the same value into them all, so the serialization cost would be the same for all the tests. I loaded 20,000 entries into a cache, split across three storage nodes. The query ran from a client over an Extend connection and would match 10% of the entries. Each query ran 1000 times and I averaged the times to get a result. Here are the results.

POF ID   Time
115      160 ms
127      202 ms
163      327 ms
173      331 ms

Now, as I have said, I have known that POF worked like this for some time, but seeing the results above I was surprised at how much difference the location of the field actually makes to the extraction time. You can see that the query on field 173 took double the time of the query on field 115. Remember, the queries all matched the same 2,000 entries of the 20,000 in the cache, and the value of the four fields was exactly the same, so the differences should only be down to the extractor time. This was not the most scientific of experiments, and times will vary depending on a lot of things, but there will always be a difference depending on how far down the stream the field you want to extract is. We can in fact do a similar experiment with just the PofExtractor and time how long it takes to extract different fields from a large number of entries. When I did this, the figures showed pretty much the same results, as you would expect.

Obviously indexed queries are a different matter; with an index the above queries all came back in a pretty constant 9 ms for all the fields. But indexes give diminishing returns: the more indexes you have, the better your queries, but the slower your updates and the more storage you need. In our case we want more storage, which is why I was investigating off-heap in the first place, so we do not want to fill our heaps with indexes, but we also need good query performance.
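Conceptually, an equality index is just a reverse map from field value to matching keys: queries become a single lookup instead of a full scan, but the map itself costs heap, which is exactly the space we were trying to save. The sketch below is plain Java, not the Coherence API (in Coherence you would call cache.addIndex(extractor, false, null)), and the class and method names are illustrative.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch of the index trade-off: faster equality queries
// in exchange for extra heap. Not the Coherence API.
public class IndexTradeOffSketch {

    // Unindexed query: walk every entry, the equivalent of deserializing
    // and extracting from each one.
    static List<Integer> scan(Map<Integer, String> cache, String value) {
        List<Integer> result = new ArrayList<>();
        for (Map.Entry<Integer, String> e : cache.entrySet()) {
            if (e.getValue().equals(value)) result.add(e.getKey());
        }
        return result;
    }

    // Build a reverse map from field value to matching keys -- conceptually
    // what an equality index maintains for you, at the cost of extra heap.
    static Map<String, Set<Integer>> buildIndex(Map<Integer, String> cache) {
        Map<String, Set<Integer>> index = new HashMap<>();
        cache.forEach((k, v) -> index.computeIfAbsent(v, x -> new HashSet<>()).add(k));
        return index;
    }

    public static void main(String[] args) {
        Map<Integer, String> cache = new HashMap<>();
        for (int i = 0; i < 20_000; i++) {
            cache.put(i, i % 10 == 0 ? "match" : "other");   // 10% match rate
        }
        // Both approaches find the same 2,000 entries; the indexed query
        // never touches the other 18,000.
        System.out.println(scan(cache, "match").size());             // 2000
        System.out.println(buildIndex(cache).get("match").size());   // 2000
    }
}
```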

The obvious solution would be to put the non-indexed fields you query most near the front of the POF stream. It is not something we have done yet, but it might be an idea if you are very concerned about non-indexed query performance.
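As a rough illustration of the payoff, the sketch below serializes the same four fields in two layouts and counts how many earlier fields the extractor has to step over to reach the hot one. It reuses the simplified tag/length/value layout from before, not Coherence's real format, and since POF properties are written in ascending ID order, "moving a field forward" in practice means assigning it a lower POF ID (here the illustrative value 100).

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Sketch: the same fields serialized in two different orders. Counting the
// fields skipped before reaching the target shows why giving frequently
// queried, non-indexed fields low POF IDs pays off.
// (Simplified tag/length/value layout, not Coherence's real POF format.)
public class FieldOrderSketch {

    static byte[] serialize(int[] idsInOrder) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int id : idsInOrder) {
            byte[] data = "payload".getBytes(StandardCharsets.UTF_8);
            out.writeBytes(ByteBuffer.allocate(8).putInt(id).putInt(data.length).array());
            out.writeBytes(data);
        }
        return out.toByteArray();
    }

    // Returns the number of fields skipped before reaching targetId, or -1.
    static int fieldsSkipped(byte[] stream, int targetId) {
        ByteBuffer buf = ByteBuffer.wrap(stream);
        int skipped = 0;
        while (buf.hasRemaining()) {
            int id  = buf.getInt();
            int len = buf.getInt();
            if (id == targetId) return skipped;
            buf.position(buf.position() + len);   // skip this field's value
            skipped++;
        }
        return -1;
    }

    public static void main(String[] args) {
        // Original layout: the hot query field has the highest ID (173),
        // so it sits at the end of the stream.
        byte[] original = serialize(new int[] {115, 127, 163, 173});
        // Renumbered layout: the hot field is given a low ID (100, an
        // illustrative value) so it is serialized first.
        byte[] renumbered = serialize(new int[] {100, 115, 127, 163});

        System.out.println(fieldsSkipped(original, 173));    // 3
        System.out.println(fieldsSkipped(renumbered, 100));  // 0
    }
}
```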

I am not saying POF is bad, far from it and it makes sense to use it in Coherence, but it is always good to be aware of exactly how something you use works and how decisions as simple as the order you serialize fields impacts the performance of your system.

3 Responses

  1. Thanks for the material, and the blog truly looks awesome. Just what wordpress theme are you using?

  2. jk says:

    The theme is called PageLines

  3. I’m assuming you actually had 173+ properties in this class?

    If you only had these 4 properties, this is very much the same as giving them indexes 1, 2, 3 and 4, as PofReader simply reads the next property index from the stream and checks if that's the one you want to read, which is the case in the vast majority of cases.

    That said, your conclusion still stands — there is a bit of extra work PofValue has to do to find the offset and length of each subsequent property in the stream, so accessing fourth property is more expensive than accessing third, which is more expensive than second, etc.

    However, once you access property N, offsets and lengths of all properties between 0 and N are cached, so subsequent accesses to any of them have the same cost (it is a direct buffer access at that point). This means that accessing the last property first will make future access to any property cost the same (and be quite cheap, as stream parsing is what consumes most time), which might have interesting implications on criteria ordering in composite filters.

    It would also be possible to optimize read performance (at expense of storage size) by pre-parsing the stream on entry update and storing property offset/length map as a binary decoration, but I’ll leave that as an exercise for your next blog post ;-) (keep them coming, btw, there is a lot of great stuff here)

    Cheers,
    Aleks
