Custom Connector
Set up a self-hosted custom HTTP connector.
Note: A custom connector must be created and maintained by you. Weld will not be able to provide support for the connector itself, as it is a self-hosted solution.
1) Build a connector in your preferred language, following the format and endpoints described below
2) Host the connector on a public endpoint
3) Supply the endpoint and a token in the Custom Connector Setup and connect
Once set up, Weld takes care of scheduling, processing, loading data into your data warehouse, schema generation, error handling, alerting, data lineage, etc.
This way you get all the features of Weld even for APIs where Weld has not yet built a standard integration.
Hint
- See a step-by-step guide through an example in our docs (under construction)
- See examples of connectors on our GitHub
Endpoints
You have to implement two endpoints in your custom connector: /schema and / (root).
- /schema is called by Weld to get the schema for the connector (table names, field names, and types)
- / (root) is called once for every table name to get the data
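As a rough orientation, the two endpoints could be wired up with nothing but the Python standard library. This is a minimal sketch, not a reference implementation: it assumes the request and response bodies are JSON, it skips the token check for brevity, and `SCHEMA`, `get_rows`, `ConnectorHandler`, and `make_server` are hypothetical names with stub data.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical stub schema; a real connector would describe its own tables.
SCHEMA = {"schema": {"customers": {"primary_key": "id"}}}


def get_rows(body: dict) -> dict:
    """Stub for the root endpoint: returns an empty page and echoes the state."""
    return {"insert": [], "state": body.get("state") or {}, "hasMore": False}


class ConnectorHandler(BaseHTTPRequestHandler):
    def _send_json(self, status: int, payload: dict) -> None:
        data = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def do_GET(self) -> None:
        # GET /schema: table names, field names, and types.
        if self.path.rstrip("/") == "/schema":
            self._send_json(200, SCHEMA)
        else:
            self._send_json(404, {"error": "not found"})

    def do_POST(self) -> None:
        # POST / (root): one page of rows for the table named in the body.
        if self.path not in ("", "/"):
            self._send_json(404, {"error": "not found"})
            return
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        self._send_json(200, get_rows(body))

    def log_message(self, format, *args) -> None:
        pass  # silence the default per-request logging


def make_server(host: str = "127.0.0.1", port: int = 8080) -> HTTPServer:
    """Create the server; call .serve_forever() on the result to run it."""
    return HTTPServer((host, port), ConnectorHandler)
```

Running `make_server("0.0.0.0", 8080).serve_forever()` would expose both routes; in practice the same two handlers can sit behind any web framework or serverless runtime.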
/schema
Request
URL: $your_url/schema // the URL provided when setting up the connector in Weld
Method: GET
Headers:
authorization: "Bearer $your_token"; // the token provided when setting up the connector in Weld
Response
Note that "fields" is optional: if you omit it, Weld infers the fields from your JSON response. Any fields missing from the list are inferred as well, so you can let Weld do some of the heavy lifting of schema generation.
{
  schema: {
    $name_of_table_1: {
      primary_key: string, // required
      fields?: [ // optional
        { name: string, type: 'string' | 'int' | ..., default: any },
        ...
      ],
    },
    $name_of_table_2: {
      ...
    },
    ...
  },
}
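To make the response shape concrete, here is a small sketch that builds a /schema payload for two tables. The table and field names are invented for illustration, `table_schema` and `build_schema_response` are hypothetical helpers, and the `default` values of `None` (JSON null) are placeholders, since the format above does not say which defaults Weld expects or whether `default` may be omitted.

```python
import json


def table_schema(primary_key, fields=None):
    """Build one table entry; `fields` is optional and left out when None."""
    if not primary_key:
        raise ValueError("primary_key is required")
    entry = {"primary_key": primary_key}
    if fields is not None:
        entry["fields"] = fields
    return entry


def build_schema_response():
    """Return the /schema payload for two hypothetical tables."""
    return {
        "schema": {
            # Explicit field list: names and types are declared up front.
            "customers": table_schema("id", [
                {"name": "id", "type": "int", "default": None},
                {"name": "email", "type": "string", "default": None},
                {"name": "created_at", "type": "datetime", "default": None},
            ]),
            # No field list: the fields are left for Weld to infer.
            "orders": table_schema("order_id"),
        }
    }


def schema_response_json():
    """Serialize the payload the way an HTTP handler would send it."""
    return json.dumps(build_schema_response())
```

The `orders` entry carries only its primary key, which relies on the auto-inference described in the note above.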
/ (root)
Request
URL: $your_url // the URL provided when setting up the connector in Weld
Method: POST
Headers:
authorization: "Bearer $your_token"; // the token provided when setting up the connector in Weld
Weld-Mode: "data" OR "schema"; // indicates whether the data is being requested to build the schema or for the data sync
Body:
name: $name_of_table // name of the table to sync
state: $previous_state // last state returned from the endpoint
weldSyncStartedAt: $started_at_date // The date when Weld started the current sync. When there is no more data to sync, this value can be sent back to Weld in the state, so in the next sync you can get data only after this date.
weldSecret: $secret_value // Optional. If you set a secret in your response, you will get the latest value of it in this field.
Response
{
  insert: [...], // rows to insert into destination following the schema
  state: {
    ... // new state, e.g. updated_at date of latest item synced
  },
  hasMore: boolean, // if true, Weld will call the endpoint again with the updated state to get more rows
  weldSecret: any // Optional. If you set a value here, Weld will store it securely in a secret store and return it in the following requests. This is useful e.g. when working with OAuth, to keep track of the access and refresh tokens.
}
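The interplay of `state` and `hasMore` is easiest to see in a small simulation. The sketch below pages through a hypothetical in-memory table using an `updated_at` cursor, then mimics the loop Weld is described as running (calling again with the returned state while `hasMore` is true). `SOURCE`, `PAGE_SIZE`, `handle_data_request`, and `simulate_sync` are illustrative names, and treating an absent or null `state` on the first call as "start from the beginning" is an assumption.

```python
# Hypothetical in-memory source standing in for an upstream API.
SOURCE = {
    "customers": [
        {"id": i, "updated_at": f"2024-01-{i:02d}T00:00:00Z"} for i in range(1, 6)
    ],
}
PAGE_SIZE = 2  # assumption: deliberately small so the sync needs several calls


def handle_data_request(body: dict) -> dict:
    """Return one page of rows for body['name'], resuming from body['state']."""
    rows = sorted(SOURCE.get(body["name"], []), key=lambda r: r["updated_at"])
    cursor = (body.get("state") or {}).get("updated_at", "")
    # ISO 8601 strings in the same format and timezone compare chronologically.
    pending = [r for r in rows if r["updated_at"] > cursor]
    page = pending[:PAGE_SIZE]
    new_cursor = page[-1]["updated_at"] if page else cursor
    return {
        "insert": page,
        "state": {"updated_at": new_cursor},
        "hasMore": len(pending) > len(page),
    }


def simulate_sync(table: str, state=None):
    """Mimic the caller: repeat the request with the new state while hasMore."""
    collected, calls = [], 0
    while True:
        resp = handle_data_request({"name": table, "state": state})
        calls += 1
        collected.extend(resp["insert"])
        state = resp["state"]
        if not resp["hasMore"]:
            return collected, state, calls
```

Because the final state is the cursor of the last row synced, passing it into the next sync returns only rows updated after that point. A cursor like this assumes `updated_at` values are unique or that ties never straddle a page boundary; otherwise rows could be skipped.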
Google Cloud deployment example
When creating a new cloud function, make sure to toggle Trigger → Authentication → Allow unauthenticated invocations, as authentication is handled directly in the cloud function code.
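Since the function accepts unauthenticated invocations, the connector code itself has to verify the bearer token on every request. A minimal sketch of such a check follows; `is_authorized` is a hypothetical helper, and how the expected token is stored (environment variable, secret manager) is left open.

```python
import hmac


def is_authorized(headers: dict, expected_token: str) -> bool:
    """Check an 'authorization: Bearer <token>' header against the expected token."""
    value = ""
    for key, val in headers.items():
        # HTTP header names are case-insensitive.
        if key.lower() == "authorization":
            value = val
            break
    scheme, _, token = value.partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token or not expected_token:
        return False
    # Constant-time comparison avoids leaking token contents through timing.
    return hmac.compare_digest(token.encode("utf-8"), expected_token.encode("utf-8"))
```

A handler would typically call this first and return a 401 response when it yields `False`, before touching the request body.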
Google Cloud function configuration example
Types
You can use the following types in your schema:
{
  type: 'string',
  type: 'long',
  type: 'double',
  type: 'boolean',
  type: 'int',
  type: 'datetime',
  type: 'array', // will be inferred as a JSON string
  type: 'object' // will be inferred as a JSON string
}
Note:
- Unless specified, all fields are assumed to be of type 'string'.
To add nullable types, you can use the following format:
{
  type: ['null', 'string'],
  type: ['null', 'long'],
  ...
}
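One way to keep field definitions consistent is a small helper that validates the type name and applies the nullable format. This is a sketch under stated assumptions: `WELD_TYPES` and `field` are hypothetical names, the set simply mirrors the type list above, and the `default` of `None` is a placeholder.

```python
# Mirrors the type names listed above; the constant name itself is hypothetical.
WELD_TYPES = {
    "string", "long", "double", "boolean", "int", "datetime", "array", "object",
}


def field(name: str, type_name: str, nullable: bool = False, default=None) -> dict:
    """Build one entry for a table's `fields` list."""
    if type_name not in WELD_TYPES:
        raise ValueError(f"unsupported type: {type_name!r}")
    # Nullable types use the ['null', <type>] form shown above.
    type_value = ["null", type_name] if nullable else type_name
    return {"name": name, "type": type_value, "default": default}
```

For example, `field("email", "string", nullable=True)` yields a field whose type is `['null', 'string']`, while an unknown type name such as `"float"` raises a `ValueError` before the schema is ever served.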
Custom Connector examples and resources
It can be helpful to look at other code examples when building your own custom connector. See the resources below.
- Custom connector example (under construction)
- Community connectors on GitHub