Category Archives: Forward Thinkers

Create culture through teaching

When somebody comes to the analytics team with a request and says, “Hey, I’d like you to run this report,” the analytics team is expected to not just send the report but also teach the person how to run it in the future and make sure that the request doesn’t come in again.

“Building data-driven culture: An interview with ShopRunner CEO Sam Yagan,” McKinsey Quarterly. February 2019

Elevating Analytics: Jason Thompson & 33 Sticks

Jason Thompson and I share many things in common. We were born a few days apart. We recall spending too much time playing games on Apple IIe computers. We have an almost spiritual response to most of the Coen brothers’ movies. And we have been collaborating professionally for about ten years. I traveled to Utah to learn more about how 33 Sticks, the Analytics boutique he founded with Hila Dahan, inspires customers to work differently.

Jason and Hila founded 33 Sticks about seven years ago, and I’ve experienced first-hand the key principles that define 33 Sticks’ competitive advantage: the company employs a handful of rock star employees who typically work (very) remotely with clients and religiously abstain from billing by the hour. Sounds simple…but speak with any 33 Sticks customer and you will hear something more…a deeper connection to the 33 Sticks team, where personal stories are shared freely and relationships span multiple companies.

As I drive the 30 minutes from Salt Lake City to Jason’s home near American Fork, the landscape gently reminds me that I am traveling a world away from my home outside of Boston. Mountains are always present in this part of the country, and distance is faithfully measured relative to the Salt Lake Temple downtown. As I drive farther from Salt Lake, Google Maps struggles to finish announcing intersections before I pass through them. But then I reach a point where the mountains seem to take over, streets have names, and horses linger in open grass fields between manicured subdivisions. Tech is growing here at a torrid pace, and traffic is starting to grind along the wide roads. I arrive at Jason’s home, the closest thing to a headquarters for 33 Sticks, towards the end of the day.

Like all 33 Sticks employees, Jason works from a home office. “The Dude” hangs on one wall, painted by an artist clearly inspired by both Van Gogh and The Big Lebowski. An espresso maker sits peacefully nearby. Various inspirational business books and soccer club memorabilia are neatly arranged on a floor-to-ceiling bookshelf opposite the three bay windows that fill the home office with light. We spoke about 33 Sticks at length during my short visit, and the following is edited and excerpted from our conversation and an interview conducted remotely about a month later:

What are some of the challenges you have had to overcome to stay true to the purpose, or mission, of 33 Sticks?

Starting a business is difficult, especially if you are starting it with no money. You will be challenged to compromise your ideals for money…should I take a project just because it pays well even if it doesn’t align with what I want to do? One of our earliest, hardest decisions at 33 Sticks was to refuse to bill by the hour. And we turned down a lot of technical implementation projects because we want to help our clients solve a larger problem, or condition. Sure…we can implement Adobe but we want to help our customers navigate a more strategic journey and we feel that billing by the hour is in opposition to that goal.

Refusing to bill by the hour has been raised to a moral or ethical level at 33 Sticks…why is that? Why does it matter so much?

It’s part of our ideology…we want long term relationships with clients and we want to help them get to that next level. Billing by the hour directly competes with that goal…it rewards consultants for taking a long time to solve problems. Clients want problems solved quickly. By removing the billable hour we remove any incentive to just take our customer’s time. We’ve worked hard to educate the market in this area…and we’ve made a lot of progress…but it’s still a difficult conversation to have and there are companies who simply cannot buy our services without charging by the hour.

It’s also critical for our employees…as a manager I immediately focus on utilization as a key measure if we bill by the hour. By removing the billable hour we’re creating space for consultants to think. One of our best and most difficult decisions has been to remain faithful to our ideals since day one. Our employees can count on the fact that we will always remain true to our principles.

In taking the long term view towards client relationships, 33 Sticks clearly focuses on different measures. How do you measure success if you’re not focused on measures such as utilization?

If someone from an Ivy League business school took a look at 33 Sticks they would be quick to point out that we could easily be making twice our current revenue. There is a lot of money to be made in technical implementation…there is more money to be made by billing more hours or at higher rates. I really believe in the ideas of Yvon Chouinard, founder of Patagonia, who mastered the long term customer relationship. At 33 Sticks, our only measure is whether our decisions are in the best interest of our clients’ satisfaction, and the satisfaction of our employees. Yes, we have more work to do up front to demonstrate the premium value we offer. But once customers start working with 33 Sticks and experiencing the difference, we know we are establishing a long term customer relationship that will pay off. The work almost becomes an excuse to have these deeper, personal relationships. I’m often criticized for encouraging more personal conversations with customers or employees, such as saying it’s OK to treat them like family, to discuss vacations and share in life events. I feel that our focus on the personal, on the human challenges of walking the analytical journey, is a quality that separates 33 Sticks in the marketplace.

By renouncing the billable hour and advocating for remote work, it feels as if 33 Sticks has a social mission to change the way people work. Would you agree?

Absolutely…if we stay on the Patagonia theme and Yvon’s book “Let My People Go Surfing,” the fact that Patagonia makes gear or climbing equipment is secondary to providing opportunities for social responsibility and for different experiences. We have a client who works for a 100-year-old publication, not an organization that you would associate with change, and he went to his management and asked to work with his family from Hawaii for a year. That’s the kind of personal change and new opportunity 33 Sticks wants to encourage. Analytics is just a way for us to effect social change, to help people find different ways of doing work. 33 Sticks works with some very large brands that make billions of dollars. And sure, we can help them make millions more, but that isn’t personally fulfilling and we work with a lot of clients who feel unfulfilled. At 33 Sticks we want to lead people to a better way of doing work and by extension a better life.

The social mission of 33 Sticks is core to its success…and you’ve mentioned that technical ability is a given in the Analytics business. Do you feel that it’s possible to start a business without that technical acumen?

Absolutely…take a look at the story of Charity Water in the book ‘Thirst.’ Scott Harrison started without a clue of how to bring clean drinking water to people around the globe. He had skills as a nightclub promoter, but he had to learn all of his technical skills along the way.

What are some of the challenges that 33 Sticks faces today? And what ambitions do you have for the company?

We spend a lot of time fighting “good enough.” We could make a lot of money just doing work that is good enough for our customers. Hila has been the driving force behind the quality of 33 Sticks’ work. Early on we decided to stick to our highest standards of quality, and Hila champions this ideal with every client engagement.

My ambition for 33 Sticks is to have a larger voice in changing the culture or direction of a business. It’s about scaling our social mission of changing the way people do work to create an environment that is more fulfilling for everyone. And this relates to our passion for working remotely. I shouldn’t dictate where you do your work…I also have no interest in chasing the changes in technology…it’s too hard to keep up with. We have done a lot of work with Adobe over the years, but we don’t want to be perceived only as an Adobe expert. We want to continue to grow our role as personal business advisers for those companies seeking more fulfilling ways of doing business.

Jason Thompson is the co-founder of 33 Sticks, a boutique Analytics consultancy. He is also an incredible chef and aspiring barista…the author can vouch for the latte Jason created with that humble espresso maker.

Women in Data Science 2019 Cambridge Conference

On March 4th I had the pleasure of attending the third annual conference for Women in Data Science in Cambridge, MA. After missing it last year (ironically, because my daughter decided to arrive a week before the conference!) and hearing so many great things about it from my colleagues, I was determined to attend this year and excited by an impressive list of distinguished women invited to present their latest research. The one-hour delayed start, due to a mild New England snowstorm, only amplified my (and everyone else’s) anticipation.

The conference began with opening remarks by Cathy Chute, the Executive Director of the Institute for Applied Computational Science at Harvard. She reminded us that WiDS started at Stanford in 2015 and is now officially a global movement with events happening all around the globe. The one in Cambridge was made possible by a fruitful partnership between Harvard, MIT, and Microsoft Research New England.

Liz Langdon-Gray followed with updates about the Harvard Data Science Initiative (HDSI), which was about to celebrate its two-year anniversary. She also informed us that the highly anticipated Harvard Data Science Review, a brainchild of my former statistics professor and advisor Xiao-Li Meng, is going to launch later this spring. This inaugural publication of HDSI will feature “foundational thinking, research milestones, educational innovations, and major applications” in the field of data science. One of its aims is to innovate in content and presentation style and, knowing Xiao-Li’s unparalleled talent for cleverly combining deep rigor with endless entertainment, I simply cannot wait to check out the first volume of the Review when it comes out!

The first invited speaker of the conference was Cynthia Rudin, an Associate Professor of Computer Science and Electrical and Computer Engineering at Duke. Prof. Rudin started with a discussion of the concept of variable “importance” and how most methods that test for it are model-specific. However, a variable can be important for one model but not for another. Therefore, a more interesting question to answer is whether a variable is important for any good model, that is, for any member of the so-called “Rashomon set” of nearly equally accurate models.

Prof. Rudin then switched to an example that motivated her inquiry: an article on Machine Bias in ProPublica, which claimed that COMPAS, a proprietary “black-box” algorithm that predicts recidivism and is used for sentencing convicts in a number of states, is racially biased. After digging deeper into the details of the ProPublica analysis and trying to fit various models to the data herself, Prof. Rudin came to the conclusion that age and criminal history, not race, were by far the most important variables in the COMPAS algorithm! Even though it is still possible to find model classes that mimic COMPAS and utilize race, this variable’s importance is probably much smaller than what was claimed by ProPublica. Nevertheless, Prof. Rudin concluded that a “black-box” machine learning (ML) algorithm that decides a person’s fate was not an ideal solution, as it cannot be independently validated and might be sensitive to data errors. Instead, she advocated for the development of interpretable modeling alternatives.
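To make the idea of importance across a “Rashomon set” concrete, here is a minimal sketch (in Python, and not Prof. Rudin’s actual method): train several candidate models on the same data, keep the ones whose accuracy is within a small tolerance of the best, and compare a variable’s permutation importance across that set. The data, the candidate models, and the 0.01 tolerance are all illustrative assumptions.

```python
# Minimal sketch: is a feature important to *any* good model, not just to one?
# Data, candidate models, and the accuracy tolerance are illustrative assumptions,
# not Prof. Rudin's actual methodology.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=6, n_informative=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

candidates = [
    LogisticRegression(max_iter=1000),
    RandomForestClassifier(n_estimators=200, random_state=0),
    GradientBoostingClassifier(random_state=0),
]
for model in candidates:
    model.fit(X_train, y_train)

# The "Rashomon set": every candidate whose accuracy is close to the best one found.
best_accuracy = max(m.score(X_test, y_test) for m in candidates)
rashomon_set = [m for m in candidates if m.score(X_test, y_test) >= best_accuracy - 0.01]

# A variable matters "to a good model" if it carries weight for some member of the set.
for model in rashomon_set:
    result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
    print(type(model).__name__, result.importances_mean.round(3))
```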

We then heard from Stefanie Jegelka, an Associate Professor at MIT, who talked about tradeoffs between neural networks (NN) that are wide vs. deep. Even though theory states that an infinitely wide NN with 1-2 layers may represent any reasonable function, deep networks have shown higher accuracy in recent classification competitions (e.g., ILSVRC). Therefore, she concluded, it was important to understand what relationships NNs could actually represent. Then Prof. Esther Duflo, a prominent economist from MIT, discussed a Double Machine Learning approach that used the power of the ML apparatus to answer questions of a causal nature, akin to those that usually require a randomized clinical trial.

Anne Jackson, a Director of Data Science and Machine Learning at Optum, was the only industry speaker at the conference. She talked about building large-scale applications in industry settings: from data cleaning and understanding the context, to incorporating the developed model into the business process. “What we really need,” she jokingly said, “is a ‘unicorn’ (a PhD in Math, an MS in Computer Science, and a VP-level understanding of business) to get it right!” She also cautioned against blindly relying on algorithms, advocating instead for always translating models into the real world: comparing the stakes of false positives vs. false negatives, considering model drift, etc. Finally, Anne touched upon the futility of efforts to build and support custom software. Moving away from this approach, more and more businesses are starting to utilize “middleware,” which is a “layer of software that connects client and back-end systems and ‘glues’ programs together.”

The last, but most certainly not least, invited speaker was Prof. Yael Grushka-Cockayne, a Visiting Professor at HBS, whose research interests revolve around behavioral decision making (among many other things). In her fun and engaging talk, she emphasized the importance of going beyond simple point estimates when it comes to prediction. She also reminded us of the effectiveness of crowdsourcing when it comes to forecasting, with such notable examples as The Good Judgement Project, where anyone can provide an opinion on the outcome of a given world event and be rewarded for getting it right, and the Survey of Professional Forecasters, which obtains macroeconomic predictions from a group of private-sector economists and produces quarterly reports with aggregated results. The last part of the talk was devoted to the results of Prof. Grushka-Cockayne’s successful collaboration with Heathrow Airport in applying a Big Data/ML approach to improve the passenger transfer experience, which did not sound like an easy feat! Ironically, the data that proved to be most reliable, and was ultimately used in the model, came from baggage transition records.

In addition to a strong lineup of featured speakers, the conference offered an excellent poster session, where students and postdocs demonstrated their ML applications in a wide range of fields, including drug development, earthquake prediction, corruption detection, and many others. All in all, this long-awaited Cambridge WiDS conference most certainly exceeded my expectations, and I am eagerly looking forward to next year’s event.

The Analytics of Sustainability: Jaclyn Olsen and Caroleen Verly

On a quiet street just outside of the Square, Harvard’s Office for Sustainability occupies a decidedly green space.  The walls are literally a shade of green that hovers comfortably between lime and under-ripe avocado.  And if that becomes too perplexing, alternating blue walls (somewhere between ocean and indigo) provide visual relief.  My hosts Caroleen Verly and Jaclyn Olsen quickly explain that the colors were deliberately chosen as part of a broader mission to understand how the built environment affects health.  As I would soon learn, the Office for Sustainability views environmentalism with a wide-angle lens.

Jaclyn and Caroleen share an awe-inspiring picture of coordinated sustainability that extends well beyond the Harvard campus. Back in 2008, the University set a campus-wide goal of reducing greenhouse gas emissions 30% by 2016, from a 2006 baseline. It was also the first sustainability goal that unified Harvard’s sprawling, decentralized operations towards a common objective with a clear deliverable and set of priorities. The only problem was that no one had yet agreed on what constituted a greenhouse gas or on common standards of measurement.

One important role the Office for Sustainability plays is collecting and analyzing University-wide data for transparency and accountability, both internally and externally. This includes facilitating the collection and management of (large) volumes of data for participants to consume. When it came to implementing the 2006-2016 greenhouse gas reduction goal, OFS’ first step was to work with partners across campus to create a common measurement vocabulary that aligned participants in and outside of Harvard. Let us not forget that we are talking about aggregating data from disjoint “Emissions Accounting” systems that might include building data, scope 3 emissions (e.g. Air Travel, Food) data, and procurement data. We discuss the definition of “chicken” at length…does it only include the roasted variety? What about chicken parmigiana? The environmental difference between sourcing fresh vs. packaged meat is significant, and the challenge of creating a single definition of poultry is nothing to cluck about.

Jaclyn and Caroleen work with the Harvard and commercial communities to create new, credible measures for concepts such as healthy food or greenhouse gases. It’s an exercise of collaborating on a vision of what the ideal measurement should be, reaching consensus, and then using this vision to assess or fit the available data into an emerging jigsaw puzzle.

So how are things going?

The ten-year goal of reducing greenhouse gas emissions by 30% was successfully achieved by 2016. Harvard is now tackling a new set of goals, striving to become fossil fuel-free by 2050, with an interim goal of becoming fossil fuel-neutral by 2026. Never mind that “Fossil Fuel Neutral” is a new term requiring the same level of definition that “Greenhouse Gas” needed in 2008. And this is only one component of Harvard’s overarching Sustainability Plan. The Office’s work extends well beyond Harvard, providing leadership to Boston’s Green Ribbon Commission and a consortium of higher education institutions in the New England area.

So how do they do it?

One of the key ingredients of successful Analytics initiatives is clear direction from the executive team.  The goal to reduce greenhouse gases by 30% came from the top, and echoed across campus.  Same for Harvard’s new goals around Fossil Fuel usage.

A second ingredient that is often overlooked is passion.  In their work on- and off-campus, Jaclyn and Caroleen refer to a shared sense of environmental purpose among participants.

A third factor is alignment between organizational and data strategy.  The data group (and hub) is designed to satisfy the strategic goals established in the Sustainability Plan.

The fourth factor is raw talent and sustained curiosity. The Office employs expert analysts like Caroleen who are capable of the forensic work necessary to make sense of ambiguous data sources. With a clear sense of direction, she is able to model an ideal data set and work backwards with the data at hand to see what pieces fit and ultimately hang together with credibility.

Jaclyn Olsen is the Associate Director of the Harvard Office for Sustainability where she leads the development of new strategic initiatives and facilitates partnerships with faculty and other key University partners.

Caroleen Verly is an Analyst at the Harvard Office for Sustainability. Before joining OFS in 2013, Caroleen worked for the City of Cambridge to evaluate the feasibility of implementing a citywide curbside composting program.

Creating Serendipity with Data: Interview with Jeff Steward

The fifth floor of the Harvard Art Museums is home to the Lightbox Gallery, a minimalist space with nine LCD monitors arrayed on the wall as a single large viewing plane. I had no idea this space existed until my host and fellow ASC member Jeff Steward invited me to experience this venue before our interview. Used for exploring digital art on a larger scale, the Lightbox Gallery is a small area of R&D for a Museum that is also home to the Forbes Pigment Collection and world-renowned conservation labs.

We descend to the lower level where Jeff’s team explores different uses for technology in the Museum.  An original (working) Atari 2600 is on display opposite a VR station for exploring augmented reality.  Jeff picks up a small plastic kylix that was printed from a 3D data file of the real object.  He’s quick to mention that the data is a small sample of what the Museum offers to the public…and that it took about six hours to print.  We settle into a conference room, and what follows is an edited version of our interview:

Jeff, what are some of the key measures for a museum of this size, and what do you see as the role of data and analytics?

It’s a very large question that drives right to the business of running the Museum. And it can be difficult to connect the value of analytics to these operational measures. But we have an obligation to make use of the collection, and the big question is whether a piece of art is earning its keep. Managing the Cost of Ownership is a big task for a museum of our size.

What does it mean to “make use” of the collection?  What are the different ways we can experience the Museum?

We have 250,000 works but at any time only about 1,700 are on display.  Art is handled by our staff in storage, experienced by museum guests, or viewed by students and faculty for research purposes.  These are just a few examples. And our visitors include people in and outside the Harvard community who may be casually interested or actual conservators.  

Wow…that’s a lot of art behind the scenes?!

Yes…yes it is!

Does the Museum collect data on these different Users, and if so how does it employ these data?

We have a ton of “log” data from our Collection Management System. Whenever a work of art is handled in storage, or viewed by students or researchers for study, we keep very detailed records of these touch-points. That said, we haven’t fully explored all of the uses for these data, and there is a real “void” in the data once a work has been installed in a gallery. We’ve compensated for that somewhat on the Web. The Museum has a history going back at least 10 years of maintaining a Web page for every work in the collection. We have been able to use Google Analytics to track things like Visits and Page views, and we started tracking events on the site as well…such as when someone reserves the Study Center online. This is actually the only way to reserve the Center. So we do have some insight into how people are exploring the collection online, and we use these data to influence search rankings on the site. We can promote or demote a work in the rankings based on the historical traffic its page has received.
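As a rough illustration of the traffic-based promotion Jeff describes, the sketch below nudges a work’s search score using its historical page views; the log-dampened boost and the 0.1 weight are my own assumptions, not the Museum’s actual formula.

```python
import math

def adjust_score(base_relevance: float, pageviews_last_year: int) -> float:
    """Promote or demote a work in search results based on historical traffic.

    The log dampening and the 0.1 weight are illustrative choices only.
    """
    traffic_boost = math.log1p(pageviews_last_year)  # dampen very popular pages
    return base_relevance * (1 + 0.1 * traffic_boost)

# Two works with equal text relevance but different historical traffic:
print(adjust_score(1.0, pageviews_last_year=12))    # lightly visited work
print(adjust_score(1.0, pageviews_last_year=4800))  # popular work gets promoted
```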

What’s your best story of how data was used to directly influence the Museum’s operations?

Our Conservation group met with security and gave them slips of paper and instructions to record every “touch” they witness to the collection: every time someone bumps into a sculpture, accidentally touches a painting, or even if they notice some new paint flecks on the floor. After a while a lot of data had been collected, and the Conservation team used this information to decide where to hang the art. They moved those black strips of tape you see on the floor based on the “touch” data that had been collected by security. There is a mobile by Calder in the gallery, and next to this work was a sign saying “the slightest breath will set it in motion.” As you can imagine, this work received the most marks from security, and the sign was edited to change behavior and prevent damage to the art.

What are some of the ways data and analytics are changing how we experience the collection?

The pigments from the Forbes collection are a reference collection used by conservation scientists globally. The pigments are cataloged in our database and will eventually be available to the public. One of our goals is to associate pigments with the art that actually contains them, so that once a conservator has done the analysis and found a shade of crimson in a work, for example, that association is available to the public. So in theory, one day you could find online all of the Van Goghs around the world that share a common pigment.

Today we make collection data available via a public API, part of what is essentially a scholarly search interface into the collection. The data is available, but not especially accessible to a more general audience. So I am experimenting with using basic machine processing to do face detection, text detection, auto tagging, and auto captioning. It’s fascinating how wrong some of the tagging and captioning is, but you can think of the computer vision service as just another set of eyes looking at the art, even if that vision is flawed. Visitors are hoping for serendipity…odd quirks in the data you don’t expect but that are super interesting. So the thought is that computer vision is building more perspectives into the data to ultimately support non-scholarly interfaces. The Museums have a long-term commitment to collecting and cataloging the data, but we cannot support lots of interfaces into the collections. We can support those with the interest and ability to build interfaces themselves.
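For readers who want to explore the collection data Jeff mentions, the sketch below queries the Harvard Art Museums public API (https://api.harvardartmuseums.org). The endpoint and parameter names reflect my reading of the public documentation and may change; a free API key, requested from the Museums, is required.

```python
# Hypothetical example of pulling object records from the Harvard Art Museums
# public API; endpoint and field names are based on the published docs and
# should be verified against the current documentation.
import requests

API_KEY = "YOUR_API_KEY"  # issued via the Museums' API signup form

response = requests.get(
    "https://api.harvardartmuseums.org/object",
    params={"apikey": API_KEY, "q": "kylix", "size": 5},
    timeout=30,
)
response.raise_for_status()

for record in response.json().get("records", []):
    # Each record carries catalog fields; "title" and "dated" are common ones.
    print(record.get("title"), "-", record.get("dated"))
```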

Jeff Steward is the Director of Digital Infrastructure and Emerging Technology at The Harvard Art Museums in Cambridge, MA.  jeff_steward@harvard.edu.