MongoDB Sharded Cluster

MongoDB supports horizontal scaling through sharding, a method for distributing data across multiple machines. MongoDB uses sharding to support deployments with very large data sets and high-throughput operations.

For more information on MongoDB Sharding, please refer to the official documentation.

Features

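| Feature name | Supported | Details |
| --- | --- | --- |
| Column Hashing | True | Column level |
| Blocking | True | Column level |
| Incremental | True | Merge, Append |
| Custom data | True | |
| History | False | |
| ReSync | True | Table level |
| Templates | False | |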

🔧 Setup Guide

Step 1 - Add your MongoDB credentials

  1. Add the name of the database you want to connect to.
  2. Add a connection string.
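
A connection string for a sharded cluster normally points at one or more mongos routers rather than at individual shard members. As a minimal sketch, assuming placeholder host names, user, and database, the string could be assembled as below. The helper `build_mongo_uri` is hypothetical and not part of Weld or MongoDB; it mainly illustrates that credentials containing reserved characters need to be percent-encoded.

```python
from urllib.parse import quote_plus


def build_mongo_uri(user, password, hosts, database):
    """Assemble a standard mongodb:// connection string (hypothetical helper)."""
    # Percent-encode credentials so characters such as '@' or ':' stay valid.
    credentials = f"{quote_plus(user)}:{quote_plus(password)}"
    # For a sharded cluster, list the mongos routers, not the shard members.
    host_list = ",".join(hosts)
    return f"mongodb://{credentials}@{host_list}/{database}"


# All values below are placeholders, not real endpoints.
uri = build_mongo_uri(
    user="weld_reader",
    password="p@ss:word",
    hosts=["mongos1.example.com:27017", "mongos2.example.com:27017"],
    database="sales",
)
print(uri)
```

The database segment in this sketch corresponds to the database name entered in the first step; whether your deployment also needs options such as TLS or an authentication database depends on your own cluster configuration.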

Whitelist Weld's IP pool

Requests from Weld will always come from the following IP pool:

  • 3.64.84.139
  • 3.65.119.169
  • 35.156.133.78

Make sure to whitelist all three of these IP addresses in your network policies, SSH gateway server, or the database itself. If any changes to the list are ever scheduled, Weld will contact you by email.
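
When auditing firewall rules or connection logs, it can help to check whether a given address belongs to this pool. The following is a minimal sketch using Python's standard `ipaddress` module; the helper name `is_weld_ip` is an assumption made for illustration and is not something Weld provides.

```python
import ipaddress

# Weld's published IP pool, copied from the list above.
WELD_IP_POOL = frozenset(
    ipaddress.ip_address(ip)
    for ip in ("3.64.84.139", "3.65.119.169", "35.156.133.78")
)


def is_weld_ip(address):
    """Return True if the address belongs to Weld's IP pool (hypothetical helper)."""
    try:
        return ipaddress.ip_address(address) in WELD_IP_POOL
    except ValueError:
        # Not a valid IP address at all.
        return False


print(is_weld_ip("3.64.84.139"))
print(is_weld_ip("10.0.0.1"))
```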

Pack Mode

Weld can sync data from MongoDB in either packed mode or unpacked mode.

Unpacked mode

Weld unpacks one layer of nested fields and infers the data types.

For example, in unpacked mode the following document:

{
  "_id": 1,
  "type": 2,
  "nested": {
    "name": 3
  }
}

is delivered to your destination as

| _id | type | nested |
| --- | --- | --- |
| 1 | 2 | {"name":3} |
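
The unpacking behaviour can be approximated in a few lines. This is only a sketch of the idea, not Weld's implementation: the function name `unpack_one_layer` is hypothetical, treating lists the same way as nested objects is an assumption, and data type inference is left out entirely.

```python
import json


def unpack_one_layer(document):
    """Turn top-level fields into columns; keep nested values as JSON text."""
    row = {}
    for field, value in document.items():
        if isinstance(value, (dict, list)):
            # Nested structures are not unpacked further, only serialized.
            row[field] = json.dumps(value, separators=(",", ":"))
        else:
            row[field] = value
    return row


doc = {"_id": 1, "type": 2, "nested": {"name": 3}}
print(unpack_one_layer(doc))
```

Each top-level key becomes a column, while the `nested` object stays a single JSON value, which matches the example row shown for unpacked mode.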

Packed Mode

In packed mode, Weld delivers each document as a single JSON value alongside its ID. For example, the following document:

{
  "_id": 1,
  "type": 2,
  "nested": {
    "name": 3
  }
}

is delivered to your destination as

| id | json |
| --- | --- |
| 1 | {"_id":1, "type":2, "nested":{"name":3}} |
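
For comparison, packed mode can be sketched as storing the whole document in one column. Again, this is an illustration under assumptions rather than Weld's actual code: the function name `pack_document` is hypothetical, and the exact JSON spacing Weld produces may differ.

```python
import json


def pack_document(document):
    """Store the whole document as one JSON string next to its ID."""
    return {
        "id": document["_id"],
        "json": json.dumps(document, separators=(",", ":")),
    }


doc = {"_id": 1, "type": 2, "nested": {"name": 3}}
print(pack_document(doc))
```

Because the full document is kept as JSON, fields can be extracted later in the destination, and a document with new or missing fields does not change the table's columns.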

Change Pack Mode

You can change the pack mode in the configuration settings of your MongoDB connector.

When you change the pack mode for a table, we automatically perform a full re-sync of that table.


⚙ Configuration

By default, the MongoDB connector runs in Incremental (Merge) mode, using the _id field as the cursor. Because the _id is based on the creation date in MongoDB, using it as the cursor means the incremental sync captures only new entries. To capture updated records as well, change the cursor to a timestamp field that reflects the time of modification. We strongly recommend keeping the _id field as the primary key.
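
The effect of the cursor choice can be shown with a small simulation. This is a simplified sketch, not Weld's sync logic: the function `incremental_merge`, the `updated_at` field, and the use of plain integers for `_id` and timestamps are all assumptions made for illustration.

```python
def incremental_merge(destination, source_rows, cursor_field, last_cursor):
    """Merge rows whose cursor value is past last_cursor into destination.

    destination maps the primary key (_id) to a row.
    Returns the new cursor value to use for the next sync.
    """
    new_cursor = last_cursor
    for row in source_rows:
        if row[cursor_field] > last_cursor:
            destination[row["_id"]] = dict(row)  # insert or overwrite by key
            new_cursor = max(new_cursor, row[cursor_field])
    return new_cursor


# The first sync already loaded documents 1 and 2.
destination = {
    1: {"_id": 1, "status": "new", "updated_at": 100},
    2: {"_id": 2, "status": "new", "updated_at": 110},
}
# Since then, document 1 was modified and document 3 was created.
source = [
    {"_id": 1, "status": "paid", "updated_at": 130},
    {"_id": 2, "status": "new", "updated_at": 110},
    {"_id": 3, "status": "new", "updated_at": 120},
]

# Cursor on _id: only the newly created document 3 is picked up.
by_id = {key: dict(row) for key, row in destination.items()}
incremental_merge(by_id, source, cursor_field="_id", last_cursor=2)
print(by_id[1]["status"])

# Cursor on a modification timestamp: the update to document 1 is merged too.
by_ts = {key: dict(row) for key, row in destination.items()}
incremental_merge(by_ts, source, cursor_field="updated_at", last_cursor=110)
print(by_ts[1]["status"])
```

In both runs the rows are merged by `_id`, which is why keeping `_id` as the primary key matters: an updated record overwrites its earlier version instead of producing a duplicate row.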
