Your boss sends you a stack of CSV files and tells you that they need all the phone numbers reformatted, rows without emails removed, date formats converted, white space cleaned, and contacts changed from business representatives who have since left the company. They also vaguely ask you to identify prospects that "show the most promise". As you look at the mountain of files, you try to wrap your head around how you'll approach the problem. Should you try to brute force it with Excel functions and filters, and just repeat it for every file? Should you spend hours trying to create macros to automate the process? Or is there an easier way?
I'm here to show you the easier way to complete this task: through Dropbase, and a few of our prebuilt Python functions. Once you've set your initial functions, you'll be able to automate all these processes, saving you time and effort.
Once you've signed up for your free Dropbase account, the first step towards automating your data cleaning is to load in your CSV (or a number of other support filetypes). This can be done via a simple drag and drop interface, allowing for quick loading of your files.
Now that the data is loaded in, we'll go ahead and perform all the steps that the hypothetical boss at the beginning of the article asked for. To add a new data cleaning function, select one of the many options from the dropdown menu in the bottom left corner.
Changing Column Types
If you're going to be doing more advanced filtering or analytics to the data later on, one of the most important things that you should do is ensure that your date is in a parseable format, and is the correct column type. In this example the dates are initially seen as text fields, but we can easily convert them to a datetime field through the column dropdown menus.
Reformatting the Phone Numbers
For the sake of this example, let's say your boss wants the phone numbers to only include the numbers, and to remove all the brackets and dashes. To complete this, use the "Find and Replace" function to identify the symbols you wish to find, and then what corresponding symbols to replace them with.
As another way to do this (if you're comfortable with regular expressions), you can create a regular expression to remove any value that isn't a digit, making your formatting even more efficient. To do this, click the .* button on the Find and Replace function, and enter \D In the find column, and leave the replace column blank.
Below you can see the result of the function, with only the numbers remaining:
Rows without emails removed
To complete this, add a new "Remove Nulls" function, and select the emails column. The rows that contain a blank space for emails will now be removed, leaving you with only contacts that you have an email on file for.
Clean up whitespace
When data is entered, leading or trailing spaces can mess up formatting or future automation efforts. Conveniently, this is something that Dropbase performs automatically on importing a file into the software!
Change BDR contacts
Let's say the contact originally assigned to half of these leads is no longer with the company, and you wish to change the contact assigned to your new BDR representatives. Again, this is as easy as using the find and replace tool to identify instances of the old employees name, and replacing it with the newer employee.
Identifying the prospects with the most promise
To identify prospects with the most promise, a good way to do this is to filter the results that match your ideal criteria. For this example, let's filter out anyone whose business is making less than $50,000 in revenue, who responded "yes" to if they're satisfied with their current service provider, or who is willing to pay less than $20 per month for your service. After applying these filters, you're left with the prospects you care about the most.
Do more with your data
If you don't see the function you're looking for, you can also create a custom function using Pandas and Python. This requires a bit of knowledge of coding, but can be extremely useful if you're willing to google a little!
Now that the data is cleaned, you have two choices on how to use it. First, you could export back to a CSV format, with the data now cleaned. If your boss expects it back in the same form they gave it to you in, this would be the ideal way to export the new data.
If you however wished to create a centralized location that all the different files' information could be accessed, you could export to Database. With Dropbase, you're given a free database to store your information in, that's completely private to you.
Now that you have the flow completed for the first file, notice how each of the steps is preserved on the vertical sidebar. To apply these transformations to another file, it's as simple as uploading the next file, and then clicking "Run All Steps". With a click of a button, you now have another file that is completely cleaned based on the parameters you laid out!
Using Dropbase, your data can be processed repeatedly and saved in a central database, so next month when your boss sends you the latest leads, you can run thousands of records through the same cleaning and processing steps in seconds.
Get started with automating your workflows today.