MongoDB Sharded Cluster

MongoDB scales horizontally through sharding, a method for distributing data across multiple machines. MongoDB uses sharding to support deployments with very large data sets and high-throughput operations.

For more information on MongoDB Sharding, please refer to the official documentation.

Features

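| Feature name | Supported | Details |
| --- | --- | --- |
| Column Hashing | True | Column level |
| Blocking | True | Column level |
| Incremental | True | Merge, Append |
| Custom data | True | |
| History | False | |
| ReSync | True | Table level |
| Templates | False | |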

🔧 Setup Guide

Step 1 - Add your MongoDB credentials

  1. Add the name of the database you want to connect to.
  2. Add a connection string.
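
As an illustration of step 2, a connection string for a sharded cluster typically lists the mongos routers rather than the individual shards. The sketch below assembles one with Python's standard library. The host names, user, password, and database are placeholders, and `build_mongo_uri` is a hypothetical helper, not part of Weld or of any MongoDB driver.

```python
from urllib.parse import quote_plus


def build_mongo_uri(hosts, username, password, database, options=None):
    """Assemble a standard mongodb:// connection string.

    Hypothetical helper for illustration; every argument value is a placeholder.
    """
    # Percent-encode credentials so characters such as '@' or ':' do not break the URI.
    credentials = f"{quote_plus(username)}:{quote_plus(password)}"
    uri = f"mongodb://{credentials}@{','.join(hosts)}/{database}"
    if options:
        query = "&".join(f"{key}={value}" for key, value in options.items())
        uri = f"{uri}?{query}"
    return uri


uri = build_mongo_uri(
    hosts=["mongos1.example.com:27017", "mongos2.example.com:27017"],
    username="weld_reader",
    password="p@ss:word",
    database="shop",
    options={"authSource": "admin"},
)
print(uri)
```

Listing more than one router lets the client fall back to another mongos if one is unreachable.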

Whitelist Weld's IP pool

Requests from Weld will always come from the following IP pool:

  • 3.64.84.139
  • 3.65.119.169
  • 35.156.133.78

Make sure to whitelist all three of these IPs in your network policies, SSH gateway server, or the database itself. If the list is ever scheduled to change, Weld will contact you by email.
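
If your firewall or access list expects CIDR notation, each address can be written as a single-host `/32` entry. The following is a minimal sketch using Python's `ipaddress` module; `is_weld_ip` is a hypothetical helper for checking an incoming address against the pool, and how the entries are applied depends on your own network setup.

```python
import ipaddress

# The three addresses from the list above.
WELD_IPS = ("3.64.84.139", "3.65.119.169", "35.156.133.78")

# A single host written as a network gets a /32 prefix.
allowlist = [str(ipaddress.ip_network(ip)) for ip in WELD_IPS]
print(allowlist)


def is_weld_ip(source_ip: str) -> bool:
    """Return True if source_ip falls inside one of the allowlisted entries."""
    address = ipaddress.ip_address(source_ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in allowlist)


print(is_weld_ip("3.65.119.169"))
print(is_weld_ip("10.0.0.5"))
```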

Pack Mode:

Weld can sync data from MongoDB in either packed mode or unpacked mode.

Unpacked mode

Weld unpacks one layer of nested fields and infers the data types.

For example, in unpacked mode the following document:

{
  "id": 1,
  "type": 2,
  "nested": {
    "name": 3
  }
}

is delivered to your destination as

| id | type | nested |
| --- | --- | --- |
| 1 | 2 | {"name":3} |
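
The transformation shown above can be sketched in a few lines of Python. This only illustrates the idea and is not Weld's implementation: `unpack_one_level` is a hypothetical name, serializing arrays the same way as nested objects is an assumption, and the data type inference step is not modelled.

```python
import json


def unpack_one_level(document: dict) -> dict:
    """Turn top-level fields into columns; keep deeper structure as JSON text."""
    row = {}
    for key, value in document.items():
        if isinstance(value, (dict, list)):
            # Nested values stay packed as compact JSON (assumption: lists too).
            row[key] = json.dumps(value, separators=(",", ":"))
        else:
            row[key] = value
    return row


document = {"id": 1, "type": 2, "nested": {"name": 3}}
row = unpack_one_level(document)
print(row)
```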

Packed mode

In packed mode, the whole document is delivered as JSON in a single column next to its id. For example, the following document:

{
  "id": 1,
  "type": 2,
  "nested": {
    "name": 3
  }
}

is delivered to your destination as

| id | data json |
| --- | --- |
| 1 | {"id":1, "type":2, "nested":{"name":3}} |
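
For comparison, packed mode can be sketched as follows. Again this is an illustration under assumptions rather than Weld's implementation: `pack` is a hypothetical name, the output key `data` stands in for the JSON column, and the identifier field is taken to be `id` as in the example.

```python
import json


def pack(document: dict, id_field: str = "id") -> dict:
    """Keep the identifier as its own column and the full document as JSON text."""
    return {
        "id": document[id_field],
        # The whole document, including the id, is serialized unchanged.
        "data": json.dumps(document, separators=(",", ":")),
    }


document = {"id": 1, "type": 2, "nested": {"name": 3}}
row = pack(document)
print(row)
```

Because the destination schema is always the same two columns, packed mode does not change shape when documents gain or lose fields.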

Change Pack Mode

You can change the pack mode in the configuration settings of your MongoDB connector.

When you change the pack mode for a table, we automatically perform a full re-sync of that table.


⚙ Configuration

By default, the MongoDB connector always runs full syncs. To reduce sync time and processing overhead, we recommend setting up your syncs to run incrementally.

We currently support both Merge and Append mode for incremental syncs.

Merge

To run a table incrementally in merge mode, the table needs a primary key and a cursor timestamp (updated_at is preferred).

When a sync runs, Weld selects only the rows that are new or have changed since the last sync.
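
The merge behaviour can be pictured with a small in-memory sketch. It assumes an `id` primary key and an `updated_at` cursor; `merge_sync` and the sample rows are hypothetical and only model the idea, not how Weld executes a sync.

```python
from datetime import datetime


def merge_sync(source_rows, destination, last_cursor, key="id", cursor_field="updated_at"):
    """Upsert rows changed after last_cursor into destination and return the new cursor."""
    # Equivalent MongoDB filter: {"updated_at": {"$gt": last_cursor}}
    changed = [row for row in source_rows if row[cursor_field] > last_cursor]
    for row in changed:
        destination[row[key]] = row  # insert new keys, overwrite existing ones
    return max((row[cursor_field] for row in changed), default=last_cursor)


source = [
    {"id": 1, "status": "new", "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "status": "paid", "updated_at": datetime(2024, 1, 3)},
    {"id": 3, "status": "new", "updated_at": datetime(2024, 1, 4)},
]
destination = {
    1: {"id": 1, "status": "new", "updated_at": datetime(2024, 1, 1)},
    2: {"id": 2, "status": "new", "updated_at": datetime(2024, 1, 2)},
}
cursor = merge_sync(source, destination, last_cursor=datetime(2024, 1, 2))
print(sorted(destination), destination[2]["status"], cursor)
```

Row 2 is overwritten with its newer version and row 3 is inserted, while row 1 is never read again because its timestamp is older than the cursor.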

Append

If the table has no row-level updated_at timestamp, another option is to run the incremental sync in append mode. Append mode uses a cursor to keep track of how far the previous sync got, and on the next run it appends new entries to the end of the table from that point.

Append is not widely used because it does not capture updates to previously synced rows.
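
A matching sketch of append mode shows why earlier updates are missed. It assumes the cursor is an ever-increasing `id` field; `append_sync` and the sample rows are hypothetical.

```python
def append_sync(source_rows, destination, last_cursor, cursor_field="id"):
    """Append rows beyond last_cursor to destination and return the new cursor."""
    new_rows = sorted(
        (row for row in source_rows if row[cursor_field] > last_cursor),
        key=lambda row: row[cursor_field],
    )
    destination.extend(new_rows)  # existing rows are never touched
    return new_rows[-1][cursor_field] if new_rows else last_cursor


source = [
    {"id": 1, "status": "paid"},  # updated in the source after the last sync
    {"id": 2, "status": "new"},
    {"id": 3, "status": "new"},   # created after the last sync
]
destination = [
    {"id": 1, "status": "new"},
    {"id": 2, "status": "new"},
]
cursor = append_sync(source, destination, last_cursor=2)
print(len(destination), destination[0]["status"], cursor)
```

Row 3 is appended, but row 1 keeps its old status in the destination because it lies behind the cursor.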
