MongoDB Sharded Cluster

MongoDB supports horizontal scaling through sharding, a method for distributing data across multiple machines. Sharding lets a deployment handle very large data sets and high-throughput operations.

If you are looking to connect to a standard MongoDB instance, you can find the setup guide here.

For more information on MongoDB Sharding, please refer to the official documentation.

Features

Feature name	Supported	Notes
Column Hashing	True	Column level
Blocking	True	Column level
Incremental	True	Merge, Append
Custom data	True
History	False
ReSync	True	Table level
Templates	False

🔧 Setup Guide

Step 1 - Add credentials from your MongoDB

Add the name of the database you want to connect to.
Add a connection string.
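
As a rough illustration of what such a connection string can look like, the sketch below assembles one in the standard `mongodb://` URI format. For a sharded cluster, clients connect through the `mongos` routers, so those are the hosts listed. The helper name `buildMongoUri`, the host names, the user, and the database name are all hypothetical; check your own deployment for the real values and for any extra options it requires.

```javascript
// Hypothetical helper: assembles a standard mongodb:// connection string.
// User name and password are percent-encoded so that characters such as
// "@" or "/" do not break the URI.
function buildMongoUri({ user, password, hosts, database }) {
  const credentials = `${encodeURIComponent(user)}:${encodeURIComponent(password)}`;
  return `mongodb://${credentials}@${hosts.join(",")}/${database}`;
}

// Placeholder values only; none of these come from a real deployment.
const exampleUri = buildMongoUri({
  user: "weld_reader",
  password: "p@ss/word",
  hosts: ["mongos1.example.net:27017", "mongos2.example.net:27017"],
  database: "my_database",
});

console.log(exampleUri);
```

Listing more than one `mongos` host lets the driver fall back to another router if one is unavailable.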

Whitelist Weld's IP pool

Requests from Weld will always come from the following IP pool:

3.64.84.139
3.65.119.169
35.156.133.78

Make sure to whitelist all three of these IPs in your network policies, on your SSH gateway server, or on the database itself. If the list is ever scheduled to change, Weld will contact you by email.

Pack Mode

Weld can sync data from MongoDB in either packed mode or unpacked mode.

Unpacked mode

Weld unpacks one layer of nested fields and infers the data types.

In unpacked mode, a MongoDB document such as

{
  "_id": 1,
  "type": 2,
  "nested": {
    "name": 3
  }
}

is delivered to your destination as

_id	type	nested
1	2	`{"name":3}`
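
The mapping above can be sketched in a few lines of JavaScript. This is only an illustration of the described behavior, not Weld's implementation: the function name `toUnpackedRow` is hypothetical, and type inference is left out.

```javascript
// Hypothetical helper: mimics the unpacked layout described above.
// Each top-level field becomes a column; nested objects and arrays
// are kept as JSON text instead of being flattened further.
function toUnpackedRow(doc) {
  const row = {};
  for (const [key, value] of Object.entries(doc)) {
    const isNested = value !== null && typeof value === "object";
    row[key] = isNested ? JSON.stringify(value) : value;
  }
  return row;
}

const unpackedRow = toUnpackedRow({ _id: 1, type: 2, nested: { name: 3 } });
console.log(unpackedRow); // { _id: 1, type: 2, nested: '{"name":3}' }
```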

Packed mode

In packed mode, a MongoDB document such as

{
  "_id": 1,
  "type": 2,
  "nested": {
    "name": 3
  }
}

is delivered to your destination as

id	json
1	`{"_id":1, "type":2, "nested":{"name":3}}`
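
For comparison, a minimal sketch of the packed layout: the whole document is serialized into a single `json` column next to its identifier. The function name `toPackedRow` is hypothetical, and the sketch only mirrors the table above rather than Weld's actual code.

```javascript
// Hypothetical helper: mimics the packed layout described above.
// The identifier is copied to its own column and the full document
// (including _id) is stored as one JSON string.
function toPackedRow(doc) {
  return { id: doc._id, json: JSON.stringify(doc) };
}

const packedRow = toPackedRow({ _id: 1, type: 2, nested: { name: 3 } });
console.log(packedRow.id);   // 1
console.log(packedRow.json); // {"_id":1,"type":2,"nested":{"name":3}}
```

Packed mode keeps the destination schema stable when documents in a collection have differing fields, at the cost of parsing the JSON downstream.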

Change Pack Mode

You can change the pack mode in the configuration settings of your MongoDB connector.

When you change the pack mode for a table, we automatically perform a full re-sync of that table.

⚙ Configuration

By default, the MongoDB connector runs in Incremental (Merge) mode, using the _id field as the cursor. Because a default MongoDB _id (an ObjectId) embeds its creation time, using _id as the cursor means the incremental sync captures only newly created documents. If you also want to capture updated records, change the cursor to a timestamp field that reflects the time of the last modification. It is strongly recommended to keep the _id field as the Primary key.
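
The difference between the two cursor choices can be shown with a small, simplified sketch. It assumes an incremental sync that picks every document whose cursor value is greater than the last value seen. The function `selectIncremental`, the field `updated_at`, and the numeric values are hypothetical stand-ins chosen for readability.

```javascript
// Hypothetical model of an incremental sync: keep only documents whose
// cursor value is greater than the value recorded by the previous sync.
function selectIncremental(docs, cursorField, lastCursor) {
  return docs.filter((doc) => doc[cursorField] > lastCursor);
}

// State of the collection after the previous sync (which saw _id up to 2
// and updated_at up to 200). Document 1 was modified since then and
// document 3 is new.
const currentDocs = [
  { _id: 1, updated_at: 250 },
  { _id: 2, updated_at: 150 },
  { _id: 3, updated_at: 300 },
];

const byId = selectIncremental(currentDocs, "_id", 2).map((d) => d._id);
const byUpdatedAt = selectIncremental(currentDocs, "updated_at", 200).map((d) => d._id);

console.log(byId);        // [ 3 ]    -> the update to document 1 is missed
console.log(byUpdatedAt); // [ 1, 3 ] -> both the update and the new document
```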

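If you select a custom cursor, make sure the cursor field is indexed to improve the performance of the incremental sync. For example, in the MongoDB shell (replace my_collection and my_custom_cursor with your own collection and field names):

db.my_collection.createIndex({ my_custom_cursor: 1 });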