MongoDB Sharded Cluster
MongoDB offers the possibility of horizontal scaling through sharding. Sharding is a method for distributing data across multiple machines. MongoDB uses sharding to support deployments with very large data sets and high throughput operations.
If you are looking to connect to a standard MongoDB instance, you can find the setup guide here.
For more information on MongoDB Sharding, please refer to the official documentation.
Features
Feature name | Supported | |
---|---|---|
Column Hashing | True | Column level |
Blocking | True | Column level |
Incremental | True | Merge, Append |
Custom data | True | |
History | False | |
ReSync | True | Table level |
Templates | False | |
Setup Guide
Step 1 - Add credentials from your MongoDB
- Add the name of the database you want to connect to.
- Add a connection string.
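As a rough illustration of what such a connection string can look like, the sketch below assembles a standard `mongodb://` URI that lists more than one `mongos` router. The user name, password, host names and database name are placeholders invented for this example, and any extra options your deployment needs are not covered here.

```python
from urllib.parse import quote_plus


def build_mongodb_uri(username, password, hosts, database):
    """Assemble a mongodb:// URI for one or more mongos routers (illustrative helper)."""
    # Percent-encode the credentials so characters such as '@' or ':' stay unambiguous.
    credentials = f"{quote_plus(username)}:{quote_plus(password)}"
    # Multiple hosts are separated by commas in the standard URI format.
    host_list = ",".join(hosts)
    return f"mongodb://{credentials}@{host_list}/{database}"


# Placeholder values, not real credentials or hosts.
uri = build_mongodb_uri(
    "weld_reader",
    "p@ss:word",
    ["mongos0.example.net:27017", "mongos1.example.net:27017"],
    "analytics",
)
print(uri)
```

Encoding the credentials first avoids a common failure where a password containing `@` or `:` makes the URI unparseable.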
Whitelist Weld's IP pool
Requests from Weld will always come from the following IP pool:
3.64.84.139
3.65.119.169
35.156.133.78
Make sure to whitelist all three of these IPs in your network policies, on your SSH gateway server, or on the database itself. If the list is ever scheduled to change, Weld will contact you by email.
Pack Mode:
Weld can sync data from MongoDB in either packed mode or unpacked mode.
Unpacked mode
Weld unpacks one layer of nested fields and infers the data types.
In unpacked mode, each top-level field becomes its own column, and anything nested deeper stays as JSON. For example, this document:
{
"id": 1,
"type": 2,
"nested": {
"name": 3
}
}
is delivered to your destination as
id | type | nested |
---|---|---|
1 | 2 | {"name":3} |
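The example above can be approximated with a minimal Python sketch. This is not Weld's implementation: the function name `unpack_one_level` is hypothetical, and serializing lists the same way as nested objects is an assumption.

```python
import json


def unpack_one_level(document):
    """Map each top-level field to a column; keep nested values as JSON text."""
    row = {}
    for key, value in document.items():
        if isinstance(value, (dict, list)):
            # Only one layer is unpacked, so deeper structure is stored as a JSON string.
            # Treating lists like nested objects is an assumption of this sketch.
            row[key] = json.dumps(value, separators=(",", ":"))
        else:
            row[key] = value
    return row


doc = {"id": 1, "type": 2, "nested": {"name": 3}}
row = unpack_one_level(doc)
print(row)
```

Here `row` has the three columns `id`, `type` and `nested`, with `nested` holding the text `{"name":3}`.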
Packed mode
In packed mode, the whole document is delivered as a single JSON column next to its id. For example, this document:
{
"id": 1,
"type": 2,
"nested": {
"name": 3
}
}
is delivered to your destination as
id | data json |
---|---|
1 | {"id":1, "type":2, "nested":{"name":3}} |
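A minimal Python sketch of the same idea follows. It is only an illustration: the function name `pack_document` and the output key `data` are assumptions made for this example, not names taken from Weld.

```python
import json


def pack_document(document, id_field="id"):
    """Keep the id as its own column and store the full document as JSON text."""
    return {
        "id": document[id_field],
        # The whole document, including the id, goes into one JSON column.
        "data": json.dumps(document, separators=(",", ":")),
    }


doc = {"id": 1, "type": 2, "nested": {"name": 3}}
packed = pack_document(doc)
print(packed)
```

Because the packed column holds the complete document, no field is lost when the shape of your documents changes between syncs, at the cost of having to parse the JSON in the destination.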
Change Pack Mode
You can change the pack mode in the configuration settings of your MongoDB connector.
When you change the pack mode for a table, we automatically perform a full re-sync of that table.
Configuration
By default, the MongoDB connector always runs full syncs. To shorten sync times and reduce processing overhead, we recommend setting up your syncs to run incrementally.
We currently support both Merge and Append mode for incremental syncs.
Merge
To run a table incrementally using the merge configuration, you need a table primary key and a cursor timestamp (updated_at is preferred).
When a sync is run, Weld will select only the new or changed rows since the last update.
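The merge behaviour can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not Weld's actual sync logic: the destination is a dictionary keyed by primary key, and the names `merge_sync`, `key` and `cursor_field` are hypothetical.

```python
from datetime import datetime


def merge_sync(source_rows, destination, last_cursor, key="id", cursor_field="updated_at"):
    """Upsert rows changed after last_cursor and return the new cursor value."""
    new_cursor = last_cursor
    for row in source_rows:
        if last_cursor is None or row[cursor_field] > last_cursor:
            # Insert a new row, or overwrite the previous version with the same key.
            destination[row[key]] = row
            if new_cursor is None or row[cursor_field] > new_cursor:
                new_cursor = row[cursor_field]
    return new_cursor


source = [
    {"id": 1, "type": 2, "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "type": 5, "updated_at": datetime(2024, 1, 3)},
]
# The destination still holds an older version of row 1.
destination = {1: {"id": 1, "type": 9, "updated_at": datetime(2023, 12, 1)}}

cursor = merge_sync(source, destination, datetime(2023, 12, 1))
print(cursor, sorted(destination))
```

After the run, row 1 has been overwritten with its newer version and row 2 has been added. Running the sync again with the returned cursor selects nothing, which is what keeps incremental runs cheap.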
Append
If a row-level updated_at timestamp is not available on the table, another option is to run an incremental sync in append mode. Append mode uses a cursor to keep track of how far the sync got on the last run. On the next run, it uses that cursor to append new entries to the end of the table.
Append is not widely used because it does not capture updates to rows that have already been synced.
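The following Python sketch models append mode and shows why later changes to earlier rows are not picked up. It assumes the cursor is an ever-increasing `id` field, which is an assumption of this example only, and the function name `append_sync` is hypothetical.

```python
def append_sync(source_rows, destination, last_cursor, cursor_field="id"):
    """Append rows whose cursor value is beyond last_cursor; return the new cursor."""
    new_cursor = last_cursor
    for row in source_rows:
        if last_cursor is None or row[cursor_field] > last_cursor:
            # Rows are only ever added at the end; existing rows are never rewritten.
            destination.append(dict(row))
            if new_cursor is None or row[cursor_field] > new_cursor:
                new_cursor = row[cursor_field]
    return new_cursor


source = [{"id": 1, "type": 2}, {"id": 2, "type": 5}]
destination = []
cursor = append_sync(source, destination, None)

# Later, row 1 changes in the source and a new row 3 arrives.
source[0]["type"] = 7
source.append({"id": 3, "type": 1})
cursor = append_sync(source, destination, cursor)
print(cursor, destination)
```

Only row 3 is appended on the second run. The destination copy of row 1 keeps its original `type` value, which is the limitation described above.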