ASF 003: Stephanie Locke interview
Introduction
Steph Locke is one of THOSE women in IT.
She received the Microsoft MVP award for the time she spends building communities that give people platforms to help each other be better with data.
At the beginning of 2017, Steph launched her own consultancy helping people start doing data science.
She loves talking about data science, the R language, Power BI and many other topics, and not only as a speaker at conferences.
This talk was recorded after SQL Saturday #645 in Manchester, UK, on Saturday 15th July 2017.
Do you want to know what Stephanie’s main area of interest is, or in what circumstances she started her own consultancy? How many books she can read in a week, or where she celebrated her last birthday?
The “Ask SQL Family” podcast uncovers the answers to these non-technical questions for you.
Transcript
Kamil Nowinski: So, hi, Stephanie.
Stephanie Locke: Hi!
Kamil Nowinski: Thank you for accepting my invitation.
Stephanie Locke: My pleasure. I like to talk.
KN: Great! It’s our pleasure basically. We are sitting after the conference in Manchester, SQL Saturday. One of the empty rooms. So I hope nobody interrupts us anymore. Let’s start at the beginning. The idea of this interview is to show you to the other people who might not know you very well and to show you not only as a professional but also as a very common person.
SL: How dare you say I’m common!? [laughs] I’m Welsh, where common is dirt. [laughs] We’re more than common! [laughs]
KN: Sorry if I used an inappropriate word. [laughs] So tell us your name and where you’re currently living.
SL: I’m Steph Locke, I live in Cardiff, Wales. I live at home with my… and I get to work from home most of the time with my husband and two dogs. So I have a great office.
KN: So how many monitors do you have at home?
SL: I actually work on just my laptop. Especially because these days, with 4K screens you can fit quite a bit on a 13″ laptop screen. And I work with clients, I work on the road, I work in hotel rooms, so I weaned myself off the dual-monitor or the really big 27″ monitor because it’s just easier if I just make do without it. It’s more convenient. Less stress.
KN: And what are you doing for a living?
SL: I started my own data science consultancy in February, so since then I’ve been doing a lot of training, helping people upskill in data science and associated technologies, helping clients deliver projects, and I’ve been doing a lot of conferences. Which is great because I did that before but now as I’m the boss I get to say “The conference is OK, Steph, you can have some time off after it.” [laughs]
KN: So when did you start your company?
SL: February.
KN: This year?
SL: Yes. And I haven’t managed to quit yet. So that’s good.
Damian Widera: We can say that you’re a pretty young boss as February is like 5 months ago.
SL: And my boss, she’s mean, I only got a Dell Inspiron, I have to earn more money before she allows me to have a much better laptop. [laughs]
KN: True. So how are things going in your company?
SL: Good. I was saying earlier that I have enough pipeline until January to cover my salary. So if nothing else, if for some reason nobody else wants to hire me till January, I get to continue doing my own thing for another six months. So that’s fantastic, that’s almost a whole year of being in my own little start-up. So I’m really happy.
KN: What is that project?
SL: I’m doing a range of stuff, from training to mentoring, some technical stuff, a whole blend of activities between now and January. Very cool. I like variety.
KN: What kind of tools do you use during your work? I know you’re involved in data science and you do a lot with R language, Power BI, etc. So what are the tools you most often use?
SL: R is definitely one of my biggest tools for data science. I do a bit of Python as well. I do a lot of stuff on Azure because quite often with data science, you also have to do a bit of data engineering, so you have to wrangle data into shapes but get it available on a daily, real-time basis. So I do quite a bit of the whole Lambda architecture stuff as well, and write a lot of Azure functions and stream analytics and stuff as well, and Azure ML.
KN: Basically, the browser. [laughs]
SL: Yeah, I automate them. Even right down to my presentations, I have a continuous integration and deployment process. So I write stuff for a talk. I do git commit, git push, it goes away to Travis CI, checks that all the code is right and builds and produces my slide deck, and puts it on github pages so that it’s deployed and available. I like automation.
KN: So you probably use PowerShell?
SL: Yeah, I do PowerShell too. PowerShell is a great little language. Because I do a lot with Azure, the whole PowerShell modules for that, they change quite a bit and there’s a lot of inconsistencies between them. So it can sometimes be a bit frustrating, writing PowerShell.
KN: But it’s a very powerful tool, right? Compared to the Command Line.
DW: Stephanie, do you really like PowerShell or do you have to like it because you need it to do your work in Azure? What is your feeling?
SL: I really like it because I believe in being lazy, so I try to avoid doing repetitive work, or work that is low value. So I like to script away those things and PowerShell helps me do that. So any language which makes it so that I get to do more cool projects is something I’m a fan of.
DW: Yeah, I think PowerShell is going in a good direction because each new version is getting better and better. So I can remember PowerShell in version 1 and it was… yeah it was new, it was fun, but right now you can do almost everything. And the remote stuff is pretty good right now. For example, part of my job is in SharePoint, and configuring SharePoint without PowerShell is impossible, so I had to like it, so now I like it.
SL: Yeah, I think often it takes that “I have to use something” to get you over the reluctance to go through that learning curve. Cause once you’ve gone through that pain of learning the new thing, you can come to appreciate why you learnt it. Whereas before you just see a lot of the learning-curvy things: “But I’m doing fine in my day job. Why would I learn this thing that automates some of my day job? I’m doing it well enough”. Once you’ve gone through it and then you look back, you go: “Wow, did I only use to do that much a day? I’m now doing like five times that much and I get to play with these cool things now.” You think “Yeah, OK, PowerShell has done a great thing for me.” Especially more DBAs should be picking up PowerShell. And more BI people.
KN: Yeah, especially DBAs. Or accidental DBAs. [laughs] Or even sometimes developer guys because sometimes you can automate deployment processes with PowerShell.
SL: And it means you don’t have to wait for an infrastructure person to get something done. You can do it yourself.
DW: Of course if you have permissions or privileges to do so. [laughs] PowerShell is a must, I think. Do you actually like the R language? Because I see you do a lot of R things also, so do you like it?
SL: I really like R. It’s a great language. Base R, which is the R that has been around for a very long time, since 2001, was actually written to be backwards compatible with a language called S, and that was written in the late 70s, early 80s, to be an interface to a load of Fortran and C algorithms, and to produce charts and stuff. So because this language was written so long ago, there are tons and tons of really weird ways that Base R works. Like to put an object into memory, you make an arrow with a less-than symbol and a hyphen. The reason we do that is because the keyboards back then had a key for that arrow symbol when they were typing this S language. So it’s a legacy of something that was around before I was born. So some Base R code can actually support programs that were written before I was born, and they will still yield the same results today. So that’s really amazing but it means the language is crazy. At the base level. So I do a lot with data.table and the whole universe of packages called the tidyverse. The tidyverse puts a really sensible veneer of code on top of R. And that makes it… it’s phenomenal for data manipulation and analysis. Like, I won’t touch SSIS anymore. The GUI is painful. But there are so many… the static pipeline is… it’s always been quite a rarity in the work I’ve done, even when I was doing a lot of BI. R allows me to do dynamic data pipelines and has all sorts of great validation tools. Parallel loads, connectors to basically every single data source one can imagine. So it’s a fantastic tool for ETL as well, and it really makes life so much easier. And it works like PowerShell. It’s a little glue language.
KN: So far I haven’t seen any solution using R as a pipeline, as a data flow. Have you done something like that?
SL: Yeah, I’ve built a number of basically like Azure functions. You use that whole micro service approach, so you would just write little R routines for converting one data source to whatever the output is, and then you can do script scheduling or batch processing with them. The nice thing is it’s really easy to build reusable code in R, so we started then having reusable modules which do cleaning in specific ways and routines, so it’s been quite useful for doing in-memory fast ETL.
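The idea of small, reusable cleaning routines that are chained together for in-memory ETL can be sketched roughly as follows. This is only an illustration in Python, not the R code Steph describes: the step names (`strip_values`, `normalise_keys`), the `run_pipeline` helper and the sample row are all hypothetical and were invented for this example.

```python
from typing import Callable, Dict, Iterable, List

# A row is a plain dictionary; a step is any function from row to row.
Row = Dict[str, str]
Step = Callable[[Row], Row]


def strip_values(row: Row) -> Row:
    """Hypothetical reusable step: trim whitespace around every value."""
    return {key: value.strip() for key, value in row.items()}


def normalise_keys(row: Row) -> Row:
    """Hypothetical reusable step: trim and lower-case every column name."""
    return {key.strip().lower(): value for key, value in row.items()}


def run_pipeline(rows: Iterable[Row], steps: List[Step]) -> List[Row]:
    """Apply each cleaning step, in order, to every row held in memory."""
    cleaned = []
    for row in rows:
        for step in steps:
            row = step(row)
        cleaned.append(row)
    return cleaned


# Made-up sample data, purely for illustration.
raw_rows = [{" Name ": "  Ada ", "City": "Cardiff  "}]
print(run_pipeline(raw_rows, [strip_values, normalise_keys]))
```

Because each step is an ordinary function, the same steps can be reused across different sources and recombined in a different order for each job.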
KN: So what you basically do is you write an Azure function and then inside the Azure function, you just put the R code and that’s what you do, yes? That’s what I understand.
SL: No, unfortunately at the moment R is not a supported language in Azure functions. I’ve done a lot of ETL in Azure functions with C# but I use R more in an on-prem situation. Or you can push them to Azure batch compute. There is a package that allows you to really easily send jobs to Azure batch. So if you need to do ETL that has to be embarrassingly parallel, big stuff, it’s quite easy to scale out with that. I’m still hoping though that we’ll get R integrated into Azure functions. That would be awesome.
KN: Exactly, so it looks like a very new feature that could replace the current ETL processes.
SL: Yeah. The problem I found with SSIS and even Alteryx and stuff is, it kind of condenses down who can do ETL. But quite often devs need to do some ETL as well, data scientists need to do some ETL, your BAs can do some. And if you can make it so that your processing engine will allow any number of languages, then your devs can write a bit in C#, your data scientists can write a bit in R or Python, your BAs can use like logic apps or something. It’s chaos if you do it wrong. But that kind of more distributed ETL fabric I think just makes it so much easier for people to flex and grow what they’re doing.
DW: Yes, we have different people with different technical knowledge, using different tools. So the ETL processes can be very complicated because they can use their own favourite languages to achieve the same goal. We just want to gather and transform some data and that’s what we like to do, so yeah.
KN: In a recent interview, you said that language is the most interesting, innovative area that data science has been applied to. Could you explain this thought?
SL: So, language is such an important component of our lives. Most people think in… actually think in words, apparently very few people think mostly in images. So all our mental constructs are language. When we communicate ideas, whether it’s written or speech, or even body language, these are all things that we are doing to communicate. And it’s such an incredibly complex, diverse area. Like if I say: “That’s great.” or “That’s great!” or “That’s great…”
KN: Different expressions.
SL: Yeah. So being able to chuck computers at the problem of language, being able to real-time translate, and then being able to real-time translate the nuance then as well, so the tone… because how people are sarcastic in English is different to how people are sarcastic in Polish. So being able to keep up with all the different ways people speak, deciphering accents, real-time translation, and then real-time translation of emotions and stuff…
KN: I would imagine that would be the next step. That’s why we sometimes don’t understand English jokes.
SL: [laughs] I spend a lot of my time building linear or logistic regression models to predict things and it’s great. It helps banks make more money, helps people make customers more happy. The Microsoft thing of Internet of cows, where they stuck Fitbits on cows and analysed how they moved so they could work out when to send an artificial bull around, improve milk yields. We’re doing all these pretty… some things are a bit dry, some things are very cool, but I think the utopia of everybody being able to understand everybody else, the Tower of Babel situation or the Douglas Adams Babel Fish, that is just the coolest area that we can be applying machine learning to. I think it’s a really laudable goal.
KN: I heard that bots are your area of interest.
SL: Bots! Yeah, I’ve become really interested in bots quite recently because I don’t like Siri and Cortana and stuff. So I can read and type much faster than I can talk, and I can talk pretty quickly. So I wanna be able to get my job done as quickly as I can, so I read a lot, and I use a lot of Linux, so I’m used to my Command Line and PowerShell and everything. So chatbots are basically text command lines.
KN: It’s a natural thing, a first idea.
SL: Yeah, it’s a way of me interfacing with systems in a text way. It’s much quicker for me than going and phoning somebody if a chatbot can answer my question. So it’s an area I’m starting to build stuff in, so I started small, I built an FAQ bot, and my aim is to build up to a bot that will allow you to say like “Show me sales by such and such,” so like the natural language Q&A stuff in Power BI, and get charts returned to you and stuff. So anybody can talk to the company chatbot with maybe like a tabular model or something behind it with measures and things and be able to get analysis back. That’s my aim, that’s what I’m working towards, and I’m gonna be doing a load of learning next month towards upping my C# or my Node.js skills to be able to get closer to that.
KN: That’s an interesting idea. Do you know how many successful implementations of this kind of bot there are across the world?
SL: I have no clue. I imagine it’s probably doubling each year right now, if nothing else. The growth rate on it is really huge and Microsoft are actually making it really easy, so their bot framework sits on Azure functions, which is partly why I like it so much because I really like Azure functions. So it just integrates really quite well with stuff that I already started learning about last year for that sort of distributed ETL. So yeah, it’s a really big growth area.
KN: When did you start focusing on data science itself and on that area? Or maybe it was something earlier?
SL: I started about 4 years ago now. So before then, I’d done some forecasting, some propensity models and things. But it was always “I think we need this, so I’m gonna go ahead and build it in my spare time” kind of thing, for work. Then I got to do something a bit more serious, so I built predictive models to say who was not going to pay their loan. When were they likely to stop paying their loan. And as a result, I was able to put these together, so that we could go to the regulator to say: “Here is our rating system, and here is what we think we need to hold against the downturn, so how much money do we need to keep behind.” And that saved the company something like 40 million pounds. They were able to take that money that was just sitting around and put it to use. And then I went to a second mortgage start-up, so I built a real-time rate-risk pricing model. So what that meant was as every application came in for a loan, everybody got a price which was appropriate for their risk, balanced for the company’s requirements, and also took into account competitors. Real-time. That was built in R with a MariaDB back-end. [laughs]
KN: And in this place we’ll do like “beep-beep.” [laughs]
SL: Yeah, I would understand. Yeah, I made an API in R with Apache because the Azure ML wasn’t out or anything by then, and it was all on-prem and had less than like a 200 ms response time and that still did a whole load of reads and writes to the database as well. So I was pretty proud of that. I did that in less than 6 months. I was like “Yeah!”. The whole application process took something like 30 seconds so I was just like “Yeah, I can do whatever I want, I can do crazy amounts of extra stuff, up to a second and nobody will notice.” Then I went to work for Mango Solutions, so I was a principal consultant for them and I did data science consultancy for them for a while. Then I went to a security start-up and helped build a real-time anomaly detection system, as people were browsing, who is going to file sharing sites in an unusual amount, or going to a new file sharing site. Or like who is… who might be sending data out of company resources. Or who has started using their computers and like, are we picking up lots more traffic to malware sites. Or worked with education customers in the UK and now they have a legal obligation to detect radicalisation, so who is going to radical sites in a way that doesn’t look just like accidentally hitting it, and firing a notification to the admins or putting an audit log in place to say this is happening. So I did a load of that, and I was looking and I was thinking: “Do I apply for Microsoft, do I do my own thing?” Microsoft never got back to me [laughs] so I thought “OK, I’ll do my own thing.” [laughs]
KN: You started talking about the last company.
SL: I actually applied to Microsoft at the same time when I was thinking about what I wanted to do, since I was coming up 30 and my decade plans for my 30s was to…
And Microsoft didn’t get back to you, but I’m not quite sure about this because you have been awarded the MVP, so basically Microsoft did get back to you.
DW: On a different level.
SL: Yeah, it’s actually the closest I’ve come with my MVP status of thinking: “Don’t you know who I am?” I was like, when I applied to Microsoft and they ignored me and my application, I didn’t even get a “Thanks, but no, thanks.” [laughs]
KN: So you just applied to Microsoft with the MVP award?
SL: Yeah! I had the MVP and I thought this lead data scientist role in their team next to the Evangelist team will be like a nice fit for me. But they never got back to me, I was like “OK.” Well, it’s worked out quite well for me. It made my choices easier. So then I launched my business and yeah. I’m quite happy with the way it worked.
KN: OK, and talking about spare time, because we mentioned it before, what about your work-life balance? Could be a little bit difficult now for you.
SL: Yeah, I tend to go through… I’m very overall balanced but not day-to-day balanced, so one day I might be on site or at a conference, then I might be home for a few days. So I’m looking forward to spending August at home, not going on site to any clients, building lots of my training materials, developing lots of stuff for kind of a Q4 programme.
KN: I saw that because you shared your calendar on the website, so August is empty.
SL: Yes, and given that September has two online conferences, potentially four in-person conferences…
KN: Including Katowice in Poland.
SL: Including Poland. And I’m looking forward to that. That’s my first time going to Poland. Yeah, so September’s really busy. So because of that I made August very quiet. So overall it’s like doing two conferences a month. I just happen to be doing four in one run. [laughs]
KN: You can spend more time on your hobby. What is your hobby?
SL: I read a lot, and I actually found an author recently who can cope with my reading speed, so there is an author called Michael Anderle and he writes these really fun books about vampires and werewolves being a product of alien intervention, so it’s a sci-fi with vampires in. And since he started writing the books in 2015, there are 17 books in the main series that he’s written and he has co-written with something like ten other authors another 30 books. There are now something like eight series in the universe. I think it’s called The Kurtherian Universe. So around these books. And now they’re kind of outputting one book a week, which is like a couple of hours’ reading for me, but he’s the only author who comes close.
KN: Crazy speed of writing. [laughs]
SL: Crazy speed of writing, which means I can do my crazy speed of reading.
KN: Maybe he has a bot to write his books.
SL: Maybe, yeah. Now he has all these authors as well working on it, so it’s really awesome. I play a lot of board games.
KN: What is your favourite?
SL: That’s a tough one. I really like my abstract games, but probably my most favourite game is Cribbage, which is a classic card game, and Star Realms, which is kind of a deck builder, so it’s very much like a resource optimisation challenge and everything. It puts my data science brain to work in games. There are some games that I don’t play very often because I’m too good at erm… There’s a game called Camel Up, where you basically place bets on how soon these camels will finish doing a race. But because the amount of moves everything can take is really constrained, because I can do all the probabilities and the conditional probabilities quite quickly in my head, I’m too good at the game and I don’t like games which I win at all the time. Something that you always win at isn’t worth doing.
KN: Basically, there are no people who want to play with you. [laughs]
SL: No! My husband would like to play the game but I’m like “Dude, I win. If I always win, that means you always lose. Why would you always want to lose?!” That’s worse than always wanting to win. I want a challenge.
DW: So the challenge could be if he took you to a casino next time you are in Poland and you could do the probability things, the data science in your head and maybe we could even win something. But I think it’s too risky.
SL: Yeah, I don’t do betting generally, just because it’s always weighted against the punter. You actually make more money just by investing in automated tracker funds. Stock brokers, they usually, like hedge fund managers and fund managers, they get short-term gains but overall they don’t perform that much better than the stock market, whereas the automated trackers, the ones where computers pick the stocks and everything, often do a lot better. So just invest your money in an automated tracker fund like Vanguard. That’s the best way to make money, instead of betting in casinos.
DW: Yeah, betting in casinos, you never win in the long-term. You can win one or two games or two bets but at the end – you’ll always be a loser.
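The point that casino bets are “weighted against the punter” can be checked with a quick expected-value calculation. The sketch below uses a single-number bet in European roulette as its example, which is an assumption for illustration only, since no particular game is named in the conversation. The `expected_value` helper is likewise hypothetical.

```python
from fractions import Fraction


def expected_value(outcomes):
    """Sum of probability * net payoff, per unit staked."""
    return sum(probability * payoff for probability, payoff in outcomes)


# Assumed example: a single-number bet on a 37-pocket European wheel.
# A win pays 35 to 1; any other pocket loses the stake.
single_number_bet = [
    (Fraction(1, 37), 35),   # win: +35 units
    (Fraction(36, 37), -1),  # lose: -1 unit
]

print(expected_value(single_number_bet))  # -1/37
```

On average the player loses 1/37 of the stake per bet, about 2.7%, so individual wins are possible but repeated play trends towards a loss.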
KN: OK, but one win is enough for me. [laughs]
DW: One big win is enough.
SL: But most people can’t stop at one win. It’s one of the biases we have: “Oh, I’ve won one game out of three, I must be able to win two games out of four or two games out of five” and then it just carries on going and every win reaffirms the value of gambling, so people just carry on, even though… they go “You’ve lost seven times. Why do you still think this is a good idea? You have less money than you started off with.” So I don’t do gambling very often.
KN: Very good. One needs to be very careful with that. It’s very addictive, I heard. I only heard that, never tried.
SL: Wanna bet?
KN: No! Maybe next time. Maybe when we stop recording. [laughs]
KN: When did you start with SQL Server?
SL: I was 20? I was doing some analytical work at confused.com, so I was doing a lot of work. I was doing web-scraping in Excel for one part of my day and then writing emails and PR stories in another part of the day. And I wanted to get more and more data to go with my spreadsheets so I got somebody to teach me some SQL and I got access to the database, and just became the horrible person who knows enough to be dangerous and leaves queries running. And then I moved into the BI team after that to get more data, to be able to do more things. And yeah, I just keep moving sideways, I’m kind of into things that are interesting and challenging.
KN: And what hints would you give to young people who want to start on that market, using SQL Server, or using that kind of stuff?
SL: Get DreamSpark.
KN: What? [laughs]
SL: So, Microsoft have these packs of credits that students can get called DreamSpark, so they can get like Azure credits, as much as the MSDN thing, for free. So if you wanna learn, go and do.
KN: But it’s only for one month?
SL: No! Basically for the duration you are a student, you can get the DreamSpark thing. So if you’re a student, go forth and do. Projects that you can show people, that show you spent the time building things, are so much better than going “Oh, I did a module on databases” because usually the person who taught you hasn’t actually written a real query in about 30 years. I’m very cynical of mainly computer science degrees and the quality of the IT person that comes out. I find that generally they believe they’re really good at computers, but quite often have no commercial or practical experience. They need breaking of the belief that they know a lot, especially in databases.
KN: Are you a perfectionist?
SL: Relatively so. I try to set myself limits and deadlines so that if I’m looking like I’m not gonna meet this deadline, then I have to stop trying to get it awesome and just try and get something that works. Cause I have a tendency to want to make brilliant things when good enough is usually the right answer. So I try and set myself stop points. Coping strategies of being a perfectionist. [laughs]
KN: Any strategy that gives you your goals is a good strategy.
DW: So maybe I can ask something. Because I’m really curious about how you started being a speaker. When was your first conference? Were you scared about talking to the people when the room is crowded? What was your feeling then?
SL: So my first time speaking properly was at the Cardiff User Group, back before I ran it. I was a bit nervous so what I did was I brought a bag of chocolates along and then I said “Everybody who asks a question gets a chocolate” so they weren’t listening to me, they were trying to think of questions to ask. And that just made it so much better. [laughs]
KN: And so much more fun probably.
Yeah. So a data-addicted woman and a chocolate-addicted woman.
SL: Yes. I’m a feeder. I believe “well-fed, full people – happy people.” They can’t get up to mischief and they don’t complain. So when I run conferences, I always try to make sure that lunch and everything is really good because that will make up for so much stuff. If the food’s bad then even the best speakers… their feedback gets lower. Because people look grumpy without good food.
KN: Exactly. We know that and we have that feedback from the SQLDay conference. Is that right, Damian?
DW: Yes, most of the statements are about the food or coffee, or drinks. Because we know that the content, the sessions are OK, so all the questions are mostly about “Make better food next time,” something like this. And you know, if you have 800 people then it’s a logistics problem how to make everybody happy, so the lunch is not so long, it’s one hour, so how are we going to do this, feed everybody? Maybe one more question, just about the conference. Do you like talking to people, do you like talking to people at conferences? And what do you think about networking? How important is this for you?
SL: In some ways I’m very anti-social. I love being at home, warm with a book and a nice coffee or a nice drink. But I also like being out and meeting people. For my 30th birthday, I went to a SQL conference, so SQLGrillen was William Durkin basically throwing me a barbecue and inviting lots of my friends. So we spent a day learning, I presented on my birthday, we had drinks before that, drinks after that, then it was a medieval festival, so we spent all day enjoying ourselves.
KN: So you were celebrating your 18th birthday with the SQL family.
SL: I see and consider a lot of the SQL family my friends more than I have just like non-SQL, normal friends. So it was a great way to spend my birthday, in the company of people who are my friends. And on the networking thing, I still find it difficult to just approach people, to talk to them, but that’s where being a speaker really comes in handy. Because if you’re nervous about getting up in front and speaking in front of people, once you’ve done that, it gives everybody else a reason to come and talk to you. So then you never have to do actually the much more scary thing of going up to a group of people and going “Hi, I’m Steph! What do you do?” So yeah, speaking or volunteering, I really recommend volunteering at an event, putting on a helper shirt really helps you network with people and overcomes a lot of anxieties about it.
KN: Thank you very much, Stephanie. It was our pleasure.
DW: Thank you.
KN: At the end of our conversation tell us where we can find you and where people can find you.
SL: Sure. You got my blog, itsalocke.com, and at the moment I also have a site full of my presentations on lockedata.uk, and you can follow me on Twitter at SteffLocke or at LockeData.
KN: OK, thank you very much indeed.
SL: My pleasure. Thank you for having me.
DW: Thank you very much and see you soon!
SL: Yeah, bye! See you soon!
Useful links:
Steph Locke’s websites: Company | Locke Data Talks
Steph Locke’s Twitters: Private | Company
Steph Locke’s GIT repository: GitHub account