Chunk Splitting in MongoDB

In this post I demonstrate how to split chunks in MongoDB. Go through it step by step and you're done.

  • Chunks are not physical data:
    • A chunk is a logical grouping/partitioning of documents, not a physical container.
    • Each chunk is described by metadata stored on the config servers (see the snippet below).
    • When you split a chunk, no change to the actual data is performed; you change only the metadata that represents the real data.
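
A minimal mongo-shell sketch of inspecting that metadata, assuming a hypothetical sharded namespace mydb.mycoll (note that in MongoDB 5.0+ config.chunks is keyed by the collection's uuid rather than ns):

    // Inspect chunk metadata for a sharded collection (run through mongos).
    db.getSiblingDB("config").chunks.find(
        { ns: "mydb.mycoll" },
        { _id: 0, min: 1, max: 1, shard: 1 }
    )
    // Each result describes one chunk: its shard-key range [min, max)
    // and the shard that currently owns it; no user data at all.
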
  • Chunk splitting algorithm (automatic method), a heuristic: mongos tracks writes to chunks.
    • Once a chunk has received roughly 20% of the maximum chunk size in writes (about 12-13 MB at the default 64 MB), a split is attempted.
    • mongos uses splitVector on the shard primary to ask for possible split points along the shard key (see the sketch below).
    • The primary returns a list of split points.
    • The metadata is updated to reflect the split.
    • No data has moved; no data has changed.
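
splitVector is an internal command, but you can run it by hand to see what split points would be proposed; a sketch, assuming the hypothetical mydb.mycoll sharded on _id (option names vary slightly across versions, so treat this as illustrative):

    // Ask for candidate split points (connect to the shard's mongod,
    // not to mongos; this is an internal command).
    db.adminCommand({
        splitVector: "mydb.mycoll",
        keyPattern: { _id: 1 },   // the shard key
        maxChunkSize: 32          // target chunk size in MB
    })
    // Returns { splitKeys: [ { _id: ... }, ... ], ok: 1 }: the points at
    // which the chunk could be split without exceeding maxChunkSize.
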
  • Manual chunk splitting:
    • sh.splitFind("dbname.collectionname", { key: ... }) splits the chunk containing a matching document at its median point; sh.splitAt() splits at exactly the shard-key value you supply (examples below).
    • To turn autosplit off, start mongos with --noAutoSplit.
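
A short sketch of both helpers, run from a mongos shell; the namespace and the x shard key are assumptions for illustration:

    // Split the chunk containing a document with x = 42 at the chunk's
    // median point (MongoDB chooses the exact split key).
    sh.splitFind("mydb.mycoll", { x: 42 })

    // Split precisely at x = 100: the chunk covering that value becomes
    // [min, 100) and [100, max).
    sh.splitAt("mydb.mycoll", { x: 100 })

Both calls only rewrite metadata; the balancer decides separately whether any chunks should migrate between shards.
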
  • Jumbo chunks
    • A jumbo chunk is one that exceeds the maximum chunk size but cannot be split.
    • They appear, for example, when you pick a very bad shard key: if many documents share a single shard-key value, there is no point at which to split.
    • They cannot be moved by the balancer (a way to spot them is shown below).
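
The balancer flags such chunks in the metadata, so a quick check looks like this (sh.status() also labels them "jumbo" in its output):

    // List chunks flagged as jumbo (run through mongos).
    db.getSiblingDB("config").chunks.find({ jumbo: true })
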

Manual Example

  • Pre-splitting overview:
  • See "How to Programmatically Pre-Split a GUID Based Shard Key with MongoDB" on Stack Overflow.
  • Caveat: do this before you insert data.
  • When would you want to do this?
    • You know what your shard key is going to be, and the domain of the data is known.
    • You already have multiple shards up and running.
    • You're about to do a bulk initial data load, pushing a lot of data into the database.
    • You want to avoid write bottlenecks on a single shard.
    • You have a large amount of data in your cluster and very few chunks, as is the case after deploying a cluster using existing data.
    • You expect to add a large amount of data that would initially reside in a single chunk or shard.
    • Example: you plan to insert a large amount of data with shard key values between 300 and 400, but all values between 250 and 500 currently sit in a single chunk (a fix is sketched below).
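
For that example, two exact splits carve the 300-400 range into its own chunk; a sketch assuming a hypothetical namespace mydb.mycoll sharded on key:

    // Cut the single [250, 500) chunk into [250, 300), [300, 400),
    // [400, 500) so the coming inserts spread over chunks instead of
    // hammering one.
    sh.splitAt("mydb.mycoll", { key: 300 })
    sh.splitAt("mydb.mycoll", { key: 400 })
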
  • Example data description:
    • Average document size: 10 KB
    • Document count: 6 million
    • Shard key: a GUID, a 32-character hex string
    • Total data: approximately 60 GB (6,000,000 × 10 KB)
    • Maximum chunk size: 64 MB by default; we are using 32 MB
    • Chunks needed: 60 GB / 32 MB ≈ 1920
    • Splitting factor: the first three hex characters of the GUID give 16 × 16 × 16 = 4096 possible prefixes; splitting at every second prefix yields 4096 / 2 = 2048 chunks, comfortably above the ~1920 needed.
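
Putting it together, a minimal mongo-shell sketch of the pre-split, in the spirit of the Stack Overflow post referenced above; mydb.mycoll and the _id shard key are hypothetical, and sharding is assumed to be enabled on the collection already:

    // Assumed setup (hypothetical names):
    //   sh.enableSharding("mydb")
    //   sh.shardCollection("mydb.mycoll", { _id: 1 })  // _id holds the GUID
    //
    // Split at every second 3-hex-digit prefix: about 4096 / 2 = 2048
    // chunks. GUIDs are hex strings, so a 3-character prefix is a valid
    // split point under lexicographic ordering.
    for (var i = 0; i < 4096; i += 2) {
        var prefix = ("000" + i.toString(16)).slice(-3);  // e.g. 10 -> "00a"
        db.adminCommand({ split: "mydb.mycoll", middle: { _id: prefix } });
    }

After the splits the chunks are still empty and all on one shard; the balancer (or explicit moveChunk commands) spreads them across the shards before the bulk load begins.
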

Suggestions are welcome to improve this post. 🙂

