
Redesign of data onboarding UX
Role: Lead designer
Partners: 1 PM, 2 Engineers
Business context
This project was part of a key business initiative to make our product self-service. Historically, the product over-emphasized flexibility to the detriment of user experience: it was biased toward advanced, technical users who understood its limitations and ‘workarounds.’ Users outside the Amperity service team were almost non-existent, and gaining access to change data pipelines required 40 hours of training.
Problem areas
The existing workflow not only required users to know and work in SQL; ironically, the most efficient way to work was to create a dummy landing location and set up the data pipeline backwards from it.

The team and I decided a design sprint would be the best way to begin this project, given its complexity and the need for multiple perspectives: experts in data quality, experts in system requirements, and our limited pool of external users.
Design sprint
We followed the Google Design Sprint methodology to understand, design, and test a prototype in just 5 days.
SPRINT GOAL
Reduce the time and expertise it takes to onboard data.

Through our sessions with experts, we identified a few key areas that needed improvement:
1. The process is iterative. Most users have to import data many times to correct and shape it for use downstream.
2. Seeing the data is key. System limitations required users to go through several time-intensive steps to get their first view of the data. This drove users out of our platform to external systems that made it easier to view the shape of the data.

The raw cycle had the most inefficiencies, so we decided to focus on improving this step.
Design sprint prototype
Using SQL, users can explore their incoming data for validation.
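For illustration, a validation query in the prototype might look like the sketch below (table and column names are hypothetical):

    -- Spot-check an incoming file for row counts and missing values
    SELECT
        COUNT(*) AS total_rows,
        COUNT(DISTINCT customer_id) AS distinct_customer_ids,
        SUM(CASE WHEN email IS NULL THEN 1 ELSE 0 END) AS missing_emails
    FROM incoming_customers;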

The primary key tab identifies a field, or combination of fields, that can serve as the table's primary key.
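Conceptually, a candidate key check compares total rows against distinct key values; a minimal sketch, again with hypothetical names:

    -- customer_id is a viable primary key only if the two counts match
    SELECT
        COUNT(*) AS total_rows,
        COUNT(DISTINCT customer_id) AS distinct_keys
    FROM incoming_customers;

Note that a check like this has to scan the entire dataset, which becomes relevant in the refinements below.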

Users set their data types and perform simple data cleaning operations.
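These are simple transforms; a hypothetical sketch of the kind of SQL they replace:

    -- Cast raw string columns to proper types and normalize values
    SELECT
        TRIM(LOWER(email)) AS email,
        CAST(order_date AS DATE) AS order_date,
        CAST(order_total AS DECIMAL(10, 2)) AS order_total
    FROM incoming_orders;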

"A huge step forward! Before, I had to blindly run a courier and wildcard it to see the files."
- User 1
"Wow, this is great! I don't have to set up a whole Databricks."
- User 2
"That's really nice! Today you have to truncate to load the data."
- User 3
Refinements
With the momentum from the design sprint, the team and I came back together to assess what was feasible to ship as an MVP.
After some investigation, we found that letting users query their data would require significant structural changes to the product, so we opted for a static preview in the MVP. Similarly, we dropped the primary key suggestions: calculating them requires examining the entire dataset, which would take too long.
With these changes, we opted for a simplified experience that could fit seamlessly into our existing platform.

Before
Required code to configure.
Disconnected workflow that depended on two separate objects being created beforehand, with no guidance.


After (mockups)
EMPTY STATE

Credential is validated on open.
File browser to select files directly from the upstream system.
Dependent fields are hidden until prerequisites are met.
Preview box provides structure and prompts the user for action.
AFTER SELECTION
Selected file determines the file type and sets smart defaults for additional fields.
Preview of the first 100 rows is available in tabular and raw formats.
Feed selection connects the feed and courier into one setup flow.

