Getting started with Adobe Analytics and Alteryx

Sometimes you need your Web data raw and ready to stew.  Adobe Data Warehouse is great for that. But sometimes you need to toss other ingredients into this stew and feed the troops at every meal.  And at other times your job is to break out the silver and create a work of culinary art. Alteryx workflows and Tableau visualizations are powerful tools to have in the kitchen for more complex data assembly and mouth-watering presentation.

This post is about giving you the best of both worlds…or at least getting you started.  

Adobe Analytics provides leading-edge reporting and segmentation tools coupled with the ability to consolidate almost everything under one (Adobe) roof.  But if you haven’t built this roof yet, or if you need to blend Web data into an internal report or predictive model, the Alteryx Connector to Adobe Analytics (a macro created by Taylor Cox) is a powerful way to pull data from the Adobe API into an existing workflow and Tableau dashboard.

Step 1 is simply to download the right connector…it’s the macro created by Taylor Cox.

In my haste and excitement, I ignored the very clear instructions to read the README file first.  This was a big mistake. I know you’re already excited, so let me give you a preview of the steps you need to follow to get the Adobe Analytics Connector up and running in no time:

  1. Make sure your Adobe User has Web Services privileges enabled
  2. Find the Username and Shared Secret on your User’s Adobe profile
  3. Download and unzip the Adobe Analytics Connector (Open in Alteryx Designer)
  4. Run the application “Adobe Analytics” in the Adobe Analytics Connector directory
  5. Enter your Adobe credentials…follow the steps…success should occur
  6. Run the application “Adobe Analytics – Library Manager” in the Adobe Analytics Connector directory
  7. You should see a list of Adobe Report Suites…add all that you need
  8. Click Finish

If everything worked as planned, when you add the Adobe Analytics Connector to an Alteryx workflow (either by inserting it as a macro or from your toolbar), your report suites (and measures) should appear in the tool configuration.  If you don’t see your report suites in the Connector, review the steps above and (please) read the README file in detail.
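
If the report suites still refuse to show up, one way to rule out a credential problem is to test your Web Services login outside Alteryx entirely. Below is a minimal sketch using the RSiteCatalyst R package (discussed further down in this post); the username, company, and shared secret strings are placeholders you would replace with the values from your Adobe user profile.

```r
# Minimal credential check against the Adobe Analytics API via RSiteCatalyst.
# The credential strings below are placeholders, not real values.
library(RSiteCatalyst)

# Legacy Web Services authentication takes "username:company" plus the shared secret
SCAuth("your_username:your_company", "your_shared_secret")

# If authentication succeeds, this lists the report suites your user can access,
# the same suites you should see in the Connector's Library Manager
suites <- GetReportSuites()
print(suites)
```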

So what’s next?

The limitations of the Adobe Connector are essentially those native to the Adobe API, such as restrictions on element (dimension) combinations or on volume per API call (e.g. the 50,000-element breakdown limit).  Within those limits, though, the Adobe Connector makes it very easy to pull larger data sets filtered by one or two segments and including elements with classifications.
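
As a rough sketch of what that kind of pull looks like when expressed against the API directly, here is an RSiteCatalyst equivalent (the package is covered a bit more below). The report suite ID, segment ID, classification name, and dates are placeholder assumptions, and the top value mirrors the 50,000-element breakdown ceiling mentioned above.

```r
# Sketch: a ranked pull filtered by one segment and reported against a
# classified element. All IDs, names, and dates below are placeholders.
library(RSiteCatalyst)

SCAuth("your_username:your_company", "your_shared_secret")

page_type_visits <- QueueRanked(
  reportsuite.id = "your_report_suite",
  date.from      = "2018-01-01",
  date.to        = "2018-01-31",
  metrics        = c("visits", "pageviews"),
  elements       = "page",
  classification = "Page Type",        # report on the classification rather than the raw element
  segment.id     = "your_segment_id",  # filter the whole pull by a saved segment
  top            = 50000               # the per-call element breakdown ceiling
)
```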


For these reasons, I have found the Adobe Connector to be most useful for importing large datasets directly into existing workflows for blended Web analytics dashboards and/or predictive analytics.  For data scientists, RSiteCatalyst is the most flexible way to programmatically access Adobe Analytics in R via the API.  But if you are not an R coder, or if you have already baked predictive analytics into an Alteryx workflow and want an automated way to pull data into your model, the Adobe Analytics Connector may be a faster way to get up and running.
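
For comparison, a daily trended pull in RSiteCatalyst, written out to a file that an existing Alteryx workflow or model could pick up, might look roughly like this; the report suite ID, metrics, dates, and output path are placeholder assumptions.

```r
# Sketch: a daily trended pull written to CSV so a downstream workflow or
# model can consume it. All IDs, metrics, dates, and paths are placeholders.
library(RSiteCatalyst)

SCAuth("your_username:your_company", "your_shared_secret")

daily_traffic <- QueueTrended(
  reportsuite.id   = "your_report_suite",
  date.from        = "2018-01-01",
  date.to          = "2018-03-31",
  metrics          = c("visits", "orders"),
  elements         = "page",
  date.granularity = "day"
)

write.csv(daily_traffic, "daily_traffic.csv", row.names = FALSE)
```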

OK…that should be enough to get started using the Adobe Analytics Connector in Alteryx.  Take it for a test drive and share what you learn!    


The Cambridge Women in Data Science Conference

I was recently browsing the website of Harvard’s Institute for Applied Computational Science and saw that there was a Women in Data Science conference. I was excited to attend, so I set a reminder on my phone, and as soon as registration went live I forwarded a link to all of my colleagues. Not long after, I started hearing that the conference had sold out! It was thrilling to see so much interest. The conference was a great opportunity to hear how some women in data science are leveraging machine learning to transform healthcare and advocating for open science to foster public debate of the big data algorithms that are influencing society. Here are some highlights:

When Regina Barzilay, MIT Professor of Electrical Engineering and Computer Science, was a breast cancer patient at MGH, she could see how machine learning could be applied to uncover insights in the vast collection of patient information, including mammogram scans, pathology reports, and family history. Today she is in remission and collaborates with MGH to train models that detect high-risk lesions sooner than ever imagined and estimate their likelihood of being cancerous, reducing the number of unnecessary surgeries.

Heather Bell, who leads a digital and analytics department in biopharma, gave a big-picture talk on how various companies are using artificial intelligence to streamline the otherwise long and expensive R&D pipeline. One challenge is that it can take several months to recruit participants for clinical trials. In one example she shared, Clinithink developed an NLP platform that converts written doctor’s notes into structured data, allowing participants to be identified rapidly against trial criteria; the platform was shown to recruit 2.5 times more participants in 5% of the time. In another example, Heather described how wearables and web applications are proving effective at monitoring health between doctor visits. In one study, lung cancer patients responded to a brief weekly questionnaire about health metrics like appetite and weight, and an algorithm developed by SIVAN Innovation alerted the patients’ doctors whenever there was a concerning change. Compared with the regular follow-up cohort, 50% more patients in the intervention cohort were alive and they lived 7 months longer; the trial was stopped early because the effect was so large.

Francesca Dominici, HSPH Professor of Biostatistics and Co-Director of Harvard’s Data Science Initiative, shared her powerful longitudinal study demonstrating an association between exposure to air pollution and mortality risk among all Medicare beneficiaries (~67 million per year). As the study sparked media headlines and lends support to more stringent environmental policy at a time when such policy is hotly debated, Francesca espouses principled data science and an open science framework in which data are publicly available and results are reproducible. While privacy is an inevitable concern in an open science framework, it is worth considering Cynthia Dwork’s invention, differential privacy, an effective tool that goes beyond simple de-identification to protect individuals’ identities in research databases. Coincidentally, Cynthia was also a speaker at WiDS, where she discussed her latest endeavor: developing a metric for algorithms that classify people as fairly as possible.

Cynthia discussed how subjective fairness is; in that sense, the metric must be culturally aware, which is another rationale for open science.

Rounding out an exciting day of data science, Tamara Broderick, MIT Assistant Professor of Computer Science, discussed achieving accurate Bayesian inference with optimization, which I encourage you to watch here, along with some of the other talks I’ve highlighted. It was inspirational to hear these accomplished women in data science present some of their impactful research. I am really looking forward to next year’s conference, and I hope you are too.

To stay up to date on the Women in Data Science (WiDS) Cambridge conference, go to https://www.widscambridge.org.


The Case for Open Access Research

from Digi*Pub…

Today, 45 percent of scholarly research is published in some kind of Open Access format. Why is so much research being published in this format? What exactly is Open Access research and why is it important to research institutions and researchers? How have traditional journal publishers responded to Open Access? What are universities and other research institutions doing to curate and collect Open Access research? Can we rely on for-profit Open Access publishers to preserve research when their profit motives change?