Author

Maddy Wilson

Maddy is a content creator for SAP Analytics Cloud. She has a passion for everything digital and enjoys finding creative ways to simplify complex concepts. When she’s not behind her computer you’ll find her enjoying live music, watching a play, or paddling with her dragon boat team.

Category

Learning

The term “wrangling” may bring to mind images of cowboys and livestock. But in the world of analytics, data wrangling is the process of transforming raw data into a format that is easy to consume and analyze.

Data wrangling is a necessary step to ensure the highest quality insights when analyzing your business data. However, data wrangling can be both difficult and time-consuming, especially when it comes to large and complex data sets, or ones containing errors.

The Data Wrangling process in SAP Analytics Cloud helps you to enhance your data even faster by suggesting Smart Transformations and automating repetitive workflows with the power of machine learning technology.

Speed things up with samples

It’s no secret that organizations are collecting more and more data from a variety of sources. But while big data helps organizations to uncover better insights, wrangling large volumes of data can take a long time. To help speed things up we’ve introduced sampling to SAP Analytics Cloud.

Now, when you upload a large data set to the Modeler, your data will automatically be sampled for the wrangling stage.

This data sample is a subset of your full data set consisting of 2000 randomly selected rows. Sampling data helps to make the wrangling process more efficient. Any changes you make to your sample while wrangling will be automatically applied to your full data set once you publish it as a model.
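To make the idea concrete, here is a minimal sketch in plain Python of wrangling on a random sample and then replaying the same change on the full data set. It is only an illustration of the concept, not how SAP Analytics Cloud implements sampling; the function names and the example data are hypothetical, and only the 2000-row sample size comes from the text above.

```python
import random

SAMPLE_SIZE = 2000  # sample size quoted in the article


def sample_rows(rows, size=SAMPLE_SIZE, seed=None):
    """Return up to `size` randomly chosen rows, without replacement."""
    if len(rows) <= size:
        return list(rows)
    rng = random.Random(seed)
    return rng.sample(rows, size)


def uppercase_city(row):
    """Hypothetical transformation designed while looking at the sample."""
    return {**row, "city": row["city"].upper()}


# Hypothetical data set that is larger than the sample size.
full_data = [{"id": i, "city": "vancouver"} for i in range(10_000)]

sample = sample_rows(full_data, seed=42)
preview = [uppercase_city(r) for r in sample]       # fast feedback on the sample
published = [uppercase_city(r) for r in full_data]  # same step replayed on all rows
```

Because the transformation is defined as a step rather than as edits to individual rows, it can be previewed on the small sample and applied unchanged to every row later.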

Get a better view of your data

When it comes to data layouts, we’re typically used to dealing with rows and columns. As such, the default view in the Modeler is a grid view which looks like a familiar spreadsheet. This layout makes sense when you want to explore individual cells in detail, but when you’re dealing with large volumes of data, it can be tricky to get an overview of what you’re working with.

Card view is a different way of looking at large volumes of data. Switch to card view to see a summarized view of your data set.

Each card shows basic details about one column, such as:

  • Column type
  • Number of unique values (dimensions)
  • The mean value (measures)
  • Data quality indicated by a status bar
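As a rough illustration of what such a card summarizes, the sketch below computes the same kind of details for a single column in plain Python. The rules used here are assumptions made for the example: a column counts as a measure when all of its non-empty values are numeric, and quality is simply the share of non-empty values. The product's own classification and quality checks may well differ.

```python
from statistics import mean


def summarize_column(name, values):
    """Build a card-style summary for one column (illustrative rules only)."""
    present = [v for v in values if v is not None and v != ""]
    # Assumption: quality is the fraction of cells that are not empty.
    quality = len(present) / len(values) if values else 0.0
    # Assumption: a column is a measure if every non-empty value is a number.
    is_measure = bool(present) and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    card = {
        "column": name,
        "type": "measure" if is_measure else "dimension",
        "quality": quality,
    }
    if is_measure:
        card["mean"] = mean(present)
    else:
        card["unique_values"] = len(set(present))
    return card


revenue_card = summarize_column("revenue", [10, 20, None, 30])
region_card = summarize_column("region", ["East", "West", "East", ""])
```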

When you select a card, you’ll see even more detailed information about a particular column in the details panel.

In the details panel, you can redefine your column type as a measure or dimension. You can add value labels to measures and specify dimension attributes such as description, properties, parent-child hierarchies, and geo-locations. You'll also see a visualization of the data distribution and any data quality issues within the selected column.

Improve efficiency with Smart Transformations

The way that data is collected is often not optimized for analysis. For example, if latitude and longitude are stored in a single column in your data set, you need to split the column in order to create a geolocation based on the coordinates. Geolocations are necessary if you want to create insightful and stunning maps during your analysis. In instances like these, you’ll have to transform your data.
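The latitude and longitude example can be sketched in a few lines of plain Python. This is only an illustration of what a split transformation does to one value; the function name, the comma separator, and the sample coordinates are assumptions, not part of the product.

```python
def split_coordinates(value, separator=","):
    """Split a combined 'latitude,longitude' string into two numbers."""
    parts = value.split(separator)
    if len(parts) != 2:
        raise ValueError(f"expected two parts, got {len(parts)}: {value!r}")
    # float() tolerates surrounding whitespace, so "49.28, -123.12" works.
    latitude, longitude = float(parts[0]), float(parts[1])
    if not (-90 <= latitude <= 90 and -180 <= longitude <= 180):
        raise ValueError(f"coordinates out of range: {value!r}")
    return latitude, longitude


# Hypothetical row with both coordinates stored in one column.
row = {"store": "Downtown", "coords": "49.2827, -123.1207"}
row["latitude"], row["longitude"] = split_coordinates(row["coords"])
```

Once the two values live in separate numeric columns, they can be paired into a geolocation.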

SAP Analytics Cloud makes applying transformations easy. The machine learning technology in the platform automatically suggests Smart Transformations based on the context of your selected column(s).

Hover over the Smart Transformation and you’ll see a preview of the result on the grid.

If the transformation doesn’t produce the result you want, you can modify the transformation formula in the Transformation Bar. Another option is to use the Transformation Bar to create your own transformations from scratch.

Available data transformations:

  • Sort
  • Delete rows or columns
  • Concatenate (combine) columns
  • Split column
  • Convert column values to uppercase, lowercase, or title case
  • Convert values to date, number, or boolean

Easily track and reverse transformations

As you continue wrangling your data, the transformations you make are tracked in your transformation log. To access this log, open the history panel.

In the history panel, you can switch between your column-specific and model-specific transformation logs. If you change your mind about a particular transformation, you can use the log to reverse it, as long as there are no dependencies based on the transformation.
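One way to picture the dependency rule is a log that records which columns each transformation reads and writes. The sketch below is a hypothetical model in plain Python, not the product's internal design: a step can be reversed only if no later step reads a column it produced, and the log can be filtered by column.

```python
class TransformationLog:
    """Ordered record of transformations and the columns each one touches."""

    def __init__(self):
        self._entries = []  # kept in the order the steps were applied

    def record(self, name, reads, writes):
        self._entries.append(
            {"name": name, "reads": set(reads), "writes": set(writes)}
        )

    def for_column(self, column):
        """Names of logged transformations that read or write `column`."""
        return [
            e["name"]
            for e in self._entries
            if column in e["reads"] or column in e["writes"]
        ]

    def _index(self, name):
        for i, entry in enumerate(self._entries):
            if entry["name"] == name:
                return i
        raise KeyError(name)

    def can_undo(self, name):
        """True if no later transformation reads a column this one wrote."""
        i = self._index(name)
        written = self._entries[i]["writes"]
        return not any(e["reads"] & written for e in self._entries[i + 1:])

    def undo(self, name):
        if not self.can_undo(name):
            raise ValueError(f"{name!r} has dependent transformations")
        del self._entries[self._index(name)]


log = TransformationLog()
log.record("split coords", reads=["coords"], writes=["lat", "lon"])
log.record("uppercase city", reads=["city"], writes=["city"])
log.record("create geolocation", reads=["lat", "lon"], writes=["location"])
```

In this example, "uppercase city" can be reversed at any time, while "split coords" cannot be reversed as long as "create geolocation" still depends on the columns it created.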

Validate your changes

Once you’re happy with your data, it’s almost time to create your model. But first, it’s important to validate your data.

This step ensures that the transformations you’ve made to your sample data make sense when applied to the full data set. If there are any errors generated as a result of a transformation, you can continue wrangling in order to resolve the issue before creating your model.
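The reason validation matters is that a transformation can succeed on every sampled row and still fail on a row the sample never included. A minimal sketch of that check in plain Python, with hypothetical function names and data, might look like this:

```python
def validate(rows, transformations):
    """Apply each transformation to every row and collect the failures."""
    errors = []
    for row_number, row in enumerate(rows, start=1):
        current = row
        for transform in transformations:
            try:
                current = transform(current)
            except (ValueError, KeyError, TypeError) as exc:
                errors.append((row_number, transform.__name__, str(exc)))
                break  # later steps cannot run on a row that already failed
    return errors


def to_number(row):
    """Hypothetical transformation: convert the 'amount' column to a number."""
    return {**row, "amount": float(row["amount"])}


# Hypothetical full data set; the second row was not part of the sample.
full_data = [{"amount": "10.5"}, {"amount": "n/a"}, {"amount": "7"}]
errors = validate(full_data, [to_number])
```

An empty error list means the steps designed on the sample also hold for the full data set. Otherwise, each entry points to the row and the step that need more wrangling.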

Try Smart Data Wrangling today

Now that you’ve read an overview of the SAP Analytics Cloud smart data wrangling process, it’s time to see it in action. Check out our tutorial video and then try it out for yourself when you sign up for a 30-day trial.
