MongoDB Sharded Cluster
MongoDB offers the possibility of horizontal scaling through sharding. Sharding is a method for distributing data across multiple machines. MongoDB uses sharding to support deployments with very large data sets and high throughput operations.
If you are looking to connect to a standard MongoDB instance, you can find the setup guide here.
For more information on MongoDB Sharding, please refer to the official documentation.
Features
Feature name | Supported | Details |
---|---|---|
Column Hashing | True | Column level |
Blocking | True | Column level |
Incremental | True | Merge, Append |
Custom data | True | |
History | False | |
ReSync | True | Table level |
Templates | False | |
Setup Guide
Step 1 - Add credentials from your MongoDB
- Add the name of the database you want to connect to.
- Add a connection string.
Whitelist Weld's IP pool
Requests from Weld will always come from the following IP pool:
3.64.84.139
3.65.119.169
35.156.133.78
Make sure to whitelist all three of these IPs in your network policies, on your SSH gateway server, or on the database itself. If the list is ever scheduled to change, Weld will contact you by email.
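A quick way to double-check an allowlist is to test each of the three addresses against it. The snippet below is a minimal sketch using only Python's standard library; the function name `missing_from_allowlist` and the sample CIDR ranges are hypothetical and not part of Weld.

```python
import ipaddress

# The three addresses listed above.
WELD_IPS = ["3.64.84.139", "3.65.119.169", "35.156.133.78"]

def missing_from_allowlist(allowlist_cidrs):
    """Return the Weld IPs that no entry in the allowlist covers."""
    networks = [ipaddress.ip_network(cidr, strict=False) for cidr in allowlist_cidrs]
    return [
        ip for ip in WELD_IPS
        if not any(ipaddress.ip_address(ip) in network for network in networks)
    ]

# Hypothetical allowlist that forgot one address.
print(missing_from_allowlist(["3.64.84.139/32", "35.156.133.0/24"]))
```

An empty list means every address in the pool is covered by at least one rule.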
Pack Mode:
Weld can sync data from MongoDB in either packed mode or unpacked mode.
Unpacked mode
Weld unpacks one layer of nested fields and infers the data types.
For example, the following document:
{
"_id": 1,
"type": 2,
"nested": {
"name": 3
}
}
is delivered to your destination as
_id | type | nested |
---|---|---|
1 | 2 | {"name":3} |
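The transformation above can be approximated in a few lines of Python. This is only an illustration of the idea, not Weld's implementation: the function name `unpack_one_layer` is hypothetical, and treating arrays the same way as nested objects is an assumption.

```python
import json

def unpack_one_layer(document):
    """Turn each top-level field into a column; keep nested values as JSON text."""
    row = {}
    for key, value in document.items():
        if isinstance(value, (dict, list)):
            # Only one layer is unpacked, so nested values are not flattened further.
            # Handling lists this way is an assumption made for the sketch.
            row[key] = json.dumps(value, separators=(",", ":"))
        else:
            row[key] = value
    return row

row = unpack_one_layer({"_id": 1, "type": 2, "nested": {"name": 3}})
print(row)
```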
Packed Mode
In packed mode, Weld delivers each document as a single JSON column next to its id. For example, the following document:
{
"_id": 1,
"type": 2,
"nested": {
"name": 3
}
}
is delivered to your destination as
id | json |
---|---|
1 | {"_id":1, "type":2, "nested":{"name":3}} |
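As a rough sketch of the same idea in Python, the whole document is serialized into one column and only the identifier is kept separately. The function name `pack_document` is hypothetical and this is not Weld's actual code.

```python
import json

def pack_document(document):
    """Keep the identifier as its own column and store the full document as JSON text."""
    return {
        "id": document["_id"],
        "json": json.dumps(document, separators=(",", ":")),
    }

row = pack_document({"_id": 1, "type": 2, "nested": {"name": 3}})
print(row["id"], row["json"])
```

Because the whole document travels as one value, schema changes in the source do not add or remove columns in the destination; the trade-off is that fields have to be extracted from the JSON at query time.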
Change Pack Mode
You can change the pack mode in the configuration settings of your MongoDB connector.
When you change the pack mode for a table, we automatically perform a full re-sync of that table.
Configuration
By default, the MongoDB connector runs in Incremental (Merge) mode, using the _id field as the cursor. Because a default _id (an ObjectId) begins with its creation timestamp, using _id as the cursor means the incremental sync only captures new entries.
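To see why a default _id orders documents by creation time, note that the first four bytes of an ObjectId hold a Unix timestamp in seconds. The sketch below decodes that prefix with the standard library only; the helper name `objectid_creation_time` is hypothetical, and it applies only to default ObjectId values, not to custom _id values.

```python
from datetime import datetime, timezone

def objectid_creation_time(object_id_hex):
    """Decode the creation time stored in the first 4 bytes of an ObjectId."""
    if len(object_id_hex) != 24:
        raise ValueError("expected a 24-character hex string")
    seconds = int(object_id_hex[:8], 16)  # first 8 hex characters = 4 bytes
    return datetime.fromtimestamp(seconds, tz=timezone.utc)

older = objectid_creation_time("507f191e810c19729de860ea")
newer = objectid_creation_time("507f1f77bcf86cd799439011")
print(older < newer)
```

Updating a document never changes its _id, which is why an _id cursor cannot detect modifications.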
If you want to capture updated records as well, you can change the cursor to a timestamp field that reflects the time of the modification.
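The difference between the two cursor choices can be shown with a small in-memory simulation. This is a simplified sketch, not how Weld runs a sync: the field name `updated_at`, the integer timestamps, and the function `incremental_sync` are all assumptions made for illustration.

```python
def incremental_sync(source_rows, cursor_field, last_cursor):
    """Return rows whose cursor value is past the last synced value, plus the new cursor."""
    new_rows = [row for row in source_rows if row[cursor_field] > last_cursor]
    next_cursor = max((row[cursor_field] for row in new_rows), default=last_cursor)
    return new_rows, next_cursor

source = [
    {"_id": 1, "updated_at": 100, "status": "new"},
    {"_id": 2, "updated_at": 110, "status": "new"},
]

# First sync: both cursor choices pick up everything.
_, id_cursor = incremental_sync(source, "_id", 0)
_, ts_cursor = incremental_sync(source, "updated_at", 0)

# Record 1 is modified and record 3 is inserted.
source[0].update(status="changed", updated_at=120)
source.append({"_id": 3, "updated_at": 130, "status": "new"})

by_id, _ = incremental_sync(source, "_id", id_cursor)
by_ts, _ = incremental_sync(source, "updated_at", ts_cursor)
print([row["_id"] for row in by_id])  # [3]
print([row["_id"] for row in by_ts])  # [1, 3]
```

With the _id cursor the change to record 1 is missed; with the timestamp cursor it is picked up and merged on the primary key.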
It's strongly recommended to keep the _id field as the Primary key.
If you select a custom cursor, make sure the cursor field is indexed to improve the performance of the incremental sync. For example, in the MongoDB shell, with my_collection standing in for your collection name:
db.my_collection.createIndex({ my_custom_cursor: 1 })