SyncHub Blog


Keeping your data secure

At SyncHub, we have a big responsibility: you trust us with your data. So I thought I’d take some time to explain the main modules and architectural decisions in our system, and how they affect security.

The application can be broadly thought of as two components:

  1. your cloud data, and its movement from your cloud service to your data warehouse

  2. the SyncHub application, where you can configure how you want your data moved, who has access, etc.

Let’s explore each in more detail…

Your cloud data

This section discusses how your data moves securely from your cloud service, through our application and into your data warehouse.

Authenticating with your cloud service

One of the most confronting aspects of SyncHub is the moment we ask to connect to your cloud data. Broadly speaking, there are two mechanisms for doing this - OAuth2 handshakes, and tokens.

The OAuth2 handshake

The OAuth2 protocol is a beautiful invention. It lets a user (you) authorize a third party (SyncHub) to access another application (your cloud service), without ever exposing your username or password. Approximately 80% of our cloud connections offer this, and where available we always take it.

I won’t go into details about how the protocol works - you can find plenty of tutorials online (it gets pretty nerdy) - but you’ll know that we’re using OAuth2 for our cloud connection if you get redirected to log in on your cloud service’s login page.

Once authenticated, we store your access token (and your refresh token, if provided) using salted, two-way (reversible) Rijndael encryption. The encryption needs to be two-way because our application must pass the tokens as-is back to your cloud service when authenticating.

A quick note on scopes: Approximately half of our cloud connections offer OAuth2 Scopes which allow us to request only read-only access to your data. This is additional peace-of-mind for our users - even though SyncHub can read data, it literally cannot make any changes to your cloud service, even if we wanted to (which we definitely do not).
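To make the scopes idea concrete, here is a minimal sketch of how an OAuth2 authorization request asking for read-only access might be built. It is an illustration only: the endpoint, client ID, redirect URI and scope names are all made up, and every cloud service defines its own.

```python
import secrets
from urllib.parse import urlencode

def build_authorization_url(authorize_endpoint, client_id, redirect_uri, scopes):
    """Build the URL a browser is redirected to for the OAuth2 login step."""
    # Random value echoed back by the cloud service, used to match the
    # callback to the request that started it.
    state = secrets.token_urlsafe(16)
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),  # OAuth2 scopes are space-separated
        "state": state,
    })
    return f"{authorize_endpoint}?{query}", state

# Hypothetical values: only read scopes are requested.
url, state = build_authorization_url(
    "https://cloud.example.com/oauth/authorize",
    "demo-client-id",
    "https://app.example.com/oauth/callback",
    ["invoices.read", "contacts.read"],
)
```

Because the request never asks for a write scope, a token issued from it should not be accepted by the cloud service for any call that changes data.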

Token authentication

The remaining cloud services (usually the older ones) unfortunately do not offer OAuth2 authentication. In these cases, they offer some form of token authentication. The implementations vary between cloud services, but generally consist of providing us with a username/token and/or a password/secret, which we then include in our requests to your cloud service when polling for data.
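As a rough illustration, two common shapes of token authentication look like this. Which one applies (if either) depends entirely on the cloud service, and the function names here are hypothetical.

```python
import base64

def basic_auth_header(username, secret):
    """HTTP Basic authentication: base64 of 'username:secret'."""
    raw = f"{username}:{secret}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}

def bearer_token_header(token):
    """A static API token sent as a bearer token on every request."""
    return {"Authorization": f"Bearer {token}"}

headers = basic_auth_header("user", "pass")
```

Either header is only safe when the request itself travels over HTTPS, since base64 is an encoding and not encryption.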

It can be quite confronting for new users to SyncHub, when they are asked to provide username/password to our application. Unfortunately there is no way around this, as it is dictated by your cloud service (feel free to send your account manager an email, asking them to implement OAuth2). If you do provide your credentials, then as with OAuth2, we store them using the RijnDael algorithm, so at least you know they’re safe with us.

Data in transit

SyncHub works by moving data from your cloud service to our own servers. All of our connectors work via the cloud service APIs, and all of these APIs run over HTTPS. The exact version of SSL/TLS is determined by the cloud service.
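For readers who want to see what "HTTPS only" can mean on the client side, here is a small sketch of a TLS configuration that verifies certificates and refuses anything older than TLS 1.2. The TLS 1.2 floor is an assumption made for the example; as noted above, the version actually negotiated depends on the cloud service.

```python
import ssl

def make_tls_context():
    """Client-side TLS settings: verified certificates, TLS 1.2 or newer."""
    # The default context checks the server certificate and hostname.
    context = ssl.create_default_context()
    # Assumed floor for this sketch: reject SSLv3, TLS 1.0 and TLS 1.1.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context

context = make_tls_context()
```

A context like this can be passed to the standard library's `urllib.request.urlopen(url, context=context)` when calling an API.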

When we move your data from your cloud service to your data warehouse, it can be done entirely without your data ever landing on our own servers. I say "can" here because we optionally have a caching service (which you can disable) that does potentially store your data temporarily - in the event of a disconnection mid-sync. But to clarify, if you disable the cache, and you are using your own database (see below), then your data is never at rest with us.

Also optional (and entirely user-driven), there are a couple of places in the app where we can render the results of simple queries. For example, our Preview feature renders a random sample of your data to your screen, so that you can validate the sync. In these cases, the data is also transported over HTTPS (TLS 1.2). Furthermore, the preview is passed directly from your warehouse and down to the browser - it is never held in permanent storage by our app, only in memory.

Data at rest

Data is stored in a few different places in SyncHub…

Application data

Our SyncHub application is backed by a SQL Azure database. SQL Azure databases offer encryption-at-rest out-of-the-box. We also use Azure Cloud storage for some files, including potentially some caching situations. This cloud storage is locked down for access only by our application.

Managed data warehouse

If you are using our free managed data warehouse, this is also SQL Azure, and also encrypted-at-rest.

Your warehouse is literally a completely separate database. It has its own unique admin login (which we use to populate the data store, create the tables etc), and a unique "reader" login (which we issue to you, to use in your reporting tool). There is zero chance of data-bleed between warehouses.

For avoidance of doubt - while your data warehouse is encrypted-at-rest, within the database the actual data from your cloud service is not encrypted in your warehouse. Doing so would make it impractical to report on from your reporting tools.

Bring your own database

If you are instead using your own database, then the security of this is obviously up to you. However, we do still insist on different admin & reader logins.

Which humans have access to my data?

Technically, if you are using our managed warehouse, our support team has access to your data. In reality, though, this is very rarely needed, and only ever to help answer questions from customers.

If you are using our BYOD solution, we do not have access to your data.

Application security

Before we get to security, a quick reminder about what the SyncHub application’s role is:

  1. run our main web application. This allows users to specify the data they wish to sync from their cloud service, and how often they wish to do so

  2. run continuously in the background, polling your cloud service for changes and moving data to your warehouse, as specified in the first step

To drive this functionality, we have built a couple of .NET applications, which we host in Microsoft Azure. Of note:

  • back-end application is built in C#. Access to this (e.g. from the front-end application) is done using stateless bearer tokens

  • front-end application is built using JavaScript/CSS/HTML and .NET Core

  • backing database is SQL Azure

  • Redis used as a SignalR backplane

  • Azure Cloud Storage used for file storage

  • SQL Azure used for data storage
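The "stateless bearer tokens" item deserves a quick illustration. The idea is that the server can validate a token from its signature alone, without looking anything up. The sketch below uses an HMAC-SHA256 signature over a small JSON payload; the token format, claim names and function names are assumptions for the example and not a description of SyncHub's actual tokens.

```python
import base64
import hashlib
import hmac
import json
import time

def _b64(data):
    return base64.urlsafe_b64encode(data).decode("ascii")

def issue_token(signing_key, user_id, ttl_seconds, now=None):
    """Create 'payload.signature', where the signature is an HMAC of the payload."""
    now = time.time() if now is None else now
    payload = json.dumps({"sub": user_id, "exp": int(now) + ttl_seconds}).encode("utf-8")
    signature = hmac.new(signing_key, payload, hashlib.sha256).digest()
    return _b64(payload) + "." + _b64(signature)

def verify_token(signing_key, token, now=None):
    """Return the claims if the token is authentic and unexpired, else None."""
    now = time.time() if now is None else now
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:  # wrong number of parts, or invalid base64
        return None
    expected = hmac.new(signing_key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):  # constant-time comparison
        return None
    claims = json.loads(payload)  # parsed only after the signature checks out
    if claims["exp"] < now:
        return None
    return claims
```

No session table is involved: anyone holding the signing key can verify a token, and nobody without it can forge one.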

Client demarcation

SyncHub runs off a single master database, containing app-related data (such as your login, the endpoint configuration, segments, logging etc).

Within this database, we demarcate each of our clients using a distinct schema in SQL Server. This schema completely replicates our data structure for each client - for example, every client has their own Person table (first name, last name etc). Separating clients on a per-schema basis massively mitigates the potential for data-bleed between clients.

Un-fun fact: Some of the most developer-intensive features in our app are where we need to aggregate data across clients. For example, we have a real-time Dashboard in the office which shows connection performance of each cloud service. Developers must go severely and purposefully out of their way to report across client schemas.

But it gets even more secure. Each schema has a different database login, making them effectively as isolated as separate databases. Again, this structure mitigates data-bleed between client apps.
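As a small illustration of schema-per-client addressing, the sketch below builds a schema-qualified table name from a client ID. The `client_<id>` naming pattern and the table list are hypothetical; the point is simply that every query has to name exactly one client's schema.

```python
# Hypothetical list of tables that exist in every client schema.
KNOWN_TABLES = {"Person", "Connection", "SyncLog"}

def schema_for_client(client_id):
    """Map a client ID to its schema name (assumed pattern: client_<id>)."""
    if isinstance(client_id, bool) or not isinstance(client_id, int) or client_id <= 0:
        raise ValueError("client_id must be a positive integer")
    return f"client_{client_id}"

def qualified_table(client_id, table):
    """Return e.g. [client_42].[Person]; unknown table names are rejected."""
    if table not in KNOWN_TABLES:
        raise ValueError(f"unknown table: {table!r}")
    return f"[{schema_for_client(client_id)}].[{table}]"

name = qualified_table(42, "Person")
```

Combined with a separate database login per schema, a query that names the wrong schema would fail on permissions rather than quietly return another client's rows.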

Other features

Encryption

Where two-way encryption is required, we’ve already discussed our use of the Rijndael algorithm above. Two-way encryption is required when we need to provide the unencrypted data to a third-party, such as sending tokens to your cloud service during authentication. This algorithm is very secure, and only our app has the keys to decrypt the information.

However, if we never need access to the unencrypted data, then we can take your security a step further with a one-way algorithm. In these cases, we use a salted PBKDF2-SHA1 hash to protect your data. This means that nobody can ever view the plain, unencrypted version of your data. The classic use case for this is storing the password you use to log in to SyncHub.
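For the curious, this is roughly what salted PBKDF2-SHA1 hashing looks like using Python's standard library. It is a sketch only: the iteration count and salt length are assumptions chosen for the example, not our production settings.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 200_000  # assumed value for this sketch
SALT_BYTES = 16       # assumed value for this sketch

def hash_password(password, salt=None):
    """Return (salt, digest). A fresh random salt is generated if none is given."""
    salt = secrets.token_bytes(SALT_BYTES) if salt is None else salt
    digest = hashlib.pbkdf2_hmac("sha1", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest

def verify_password(password, salt, expected_digest):
    """Re-derive the hash from the candidate password and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)

salt, stored = hash_password("correct horse battery staple")
```

Only the salt and the digest are stored. Checking a login means hashing the submitted password with the same salt and comparing the results; the original password is never needed again.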

Two-factor authentication

Our site offers two-factor authentication to further secure and protect your personal login.
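This article does not go into which second factor is used. One widely used option is the time-based one-time password (TOTP, RFC 6238) generated by authenticator apps, and the sketch below shows how such a code is computed. Treat it as background on the general technique, under that assumption, and not as a description of SyncHub's implementation.

```python
import hashlib
import hmac
import struct

def totp(secret, unix_time, step=30, digits=6):
    """Compute an RFC 6238 TOTP code (HMAC-SHA1 variant) for the given time."""
    counter = int(unix_time) // step  # number of time steps since the epoch
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation, as defined in RFC 4226
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

# Test vector from RFC 6238 (shared secret "12345678901234567890", time 59).
code = totp(b"12345678901234567890", 59, digits=8)  # "94287082"
```

The server and the authenticator app share the secret once at setup; after that, both can derive the same short-lived code independently.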

IP whitelists / Firewall

Users may restrict access to their hosted database to only specific IP address ranges. Most popular reporting tools will provide a list of their hosted IP addresses, and these are often a good starting point for you to lock down your data.
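The logic behind an IP allowlist is simple enough to sketch in a few lines. The ranges below are reserved documentation addresses standing in for the ranges your reporting tool would publish, and the function name is hypothetical.

```python
import ipaddress

def is_allowed(client_ip, allowed_ranges):
    """True if client_ip falls inside any of the allowed CIDR ranges."""
    address = ipaddress.ip_address(client_ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in allowed_ranges)

# Example ranges only (reserved documentation addresses).
allowed = ["203.0.113.0/24", "198.51.100.7/32"]
inside = is_allowed("203.0.113.25", allowed)
outside = is_allowed("192.0.2.1", allowed)
```

Anything outside the listed ranges is refused before a login is even attempted.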

Third-party validation

For what it's worth, we've also had to comply with regulations from the Australian Tax Office (as part of our Xero Practice Manager connection), and we submit compliance validation annually.

To conclude

We know that there’s a leap of trust involved when submitting your data to third parties like SyncHub. I hope that this article helps to alleviate any concerns a little - at the very least it shows that we take security seriously.

If there’s something you don’t understand, come and find me on LinkedIn, or reach out to our friendly support team through www.synchub.io.

Happy reporting!