Now that I’ve had a few weeks to digest my experience at DataScienceGO 2018, I can say it was definitely one for the books. I especially enjoyed making so many connections and absorbing (or rather, writing a million miles an hour) so much knowledge over a short weekend. As promised, I wanted to share the useful resources I learned of at DSGO. Below I’m presenting my favorites, by presenter.
Send me a note here or on LinkedIn if you have any questions, or check out my Resources page, now updated with some of the items below:
Sinan Ozdemir: Sinan is the Founder and CTO of Kylie.ai, a customer service AI firm that builds custom solutions to augment the customer service experience. He has also been the Head Data Science Instructor at General Assembly since 2014.

* Takeaways: We’re entering a new phase of AI technology in which AI has reached a level of usability that lets new companies and competitors erode the competitive advantage of early AI adopters. AI can increase productivity, decrease bias, and reduce administrative tasks. Europe is a leader with regard to data standards and data privacy rights.
Jorge Zuloaga: Jorge is the Head of Data Science at Big Squid, where he focuses on data science consulting for business applications. Big Squid also builds SaaS platforms that automate the workflow involved in training machine learning models and putting them into production.

* Takeaways: We’re in the midst of a “machine learning tsunami,” in which we’re drowning in data but gaining deep insights through ML and big data on a daily basis. ML works best when provided with very clear, unambiguous problem sets. Problem formulation is critical: always inquire about the business need or business value associated with a problem.
Paige Bailey: Paige has worked for Microsoft and Chevron as a software engineer and data scientist and is very passionate about ML. She shared some resources that were new to me and gave some tips for deploying ML models.

* caret Package for R: caret (pronounced “carrot”) has several functions that streamline ML model building and evaluation, feature engineering, and data splitting. I have not tried this package, but it sounds similar to WEKA, an open-source program I’ve been using recently for ML model training, data splitting/folding, testing, and feature engineering.
* Keras: Keras is an open-source neural network library designed to enable fast experimentation with deep neural networks. It focuses on being user-friendly, modular, and extensible, and there’s even an R package for it! Keras uses the TensorFlow backend engine by default.
* Magenta: Magenta is a TensorFlow-based research project exploring the role of machine learning in the process of creating art and music.
* Takeaway: The process for developing AI skills should include 1) building your own AI model (yes, this takes a lot of effort and repetition), 2) modifying someone else’s model, and 3) transfer learning (reusing a model open-sourced for another purpose).
Ben Taylor: Ben is the Chief AI Officer and Co-Founder at Ziff. Ben and Ziff aim to deliver the world’s best machine and deep learning to product companies with the least amount of effort. His LinkedIn is a wealth of knowledge and also entertaining: for example, using deep learning and genetic GANs (Generative Adversarial Networks) to create previously unseen images. I’d certainly recommend following Ben.

* scikit-learn: Simple and efficient tools for data mining and ML in Python.
* Takeaway: “Get better at efficient reps, and learn how to work faster!”
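To show the kind of “efficient rep” a tool like scikit-learn makes possible, here’s a minimal workflow of my own (not from Ben’s talk): load a built-in dataset, split it, fit a classifier, and score it.

```python
# Minimal scikit-learn rep: load data, split, fit, evaluate.
# Illustrative sketch of my own, not code from the talk.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Hold out 25% of the data for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"test accuracy: {accuracy:.2f}")
```

Once this loop is second nature, swapping in a different model or dataset is a one-line change, which is exactly what makes the reps fast.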
Matt Dancho: Matt is the Founder and CEO of Business Science, where he helps data scientists learn how to apply enterprise-grade ML in business and finance. He also uses R and had a few great recommendations:

* H2O: This R package provides a scalable, open-source ML platform developed by H2O.ai.
* Lime: Lime is an R package that helps you explain the predictions made by black-box models. It’s a very helpful (and easy-to-use) tool when dealing with multivariate regressions and predictive models.
* Takeaways: Executives often lose sight of the cost of employee attrition, and model interpretability packages like Lime help explain the drivers, in order of significance, that cause employees to leave. If your ML engagement is optimization-based, get approval from your clients to collect and measure performance after the engagement.
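Lime itself is an R package, but the underlying idea of ranking the drivers behind a black-box model’s predictions can be sketched in Python with scikit-learn’s permutation importance. This is my own analogue of the technique, not Lime:

```python
# Rank the drivers behind a black-box model with permutation importance.
# Illustrative Python analogue of what Lime does in R, not Lime itself.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, random_state=0)

# A "black-box" model: accurate, but hard to interpret directly.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Shuffle each feature and measure the drop in score;
# a bigger drop means the feature drives more of the predictions.
result = permutation_importance(
    model, X_test, y_test, n_repeats=5, random_state=0)

# Drivers ranked in order of significance.
ranked = sorted(zip(data.feature_names, result.importances_mean),
                key=lambda pair: pair[1], reverse=True)
for name, score in ranked[:5]:
    print(f"{name}: {score:.3f}")
```

The same ranked-drivers output is what makes this style of analysis persuasive to executives: it turns “the model predicts attrition” into “these specific factors drive attrition, in this order.”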