7 types of job profiles that make you a Data Scientist

So yes, this post might look a bit like clickbait, but I promise you it is not exactly that (well, somewhat).

I recently got a question on Quora along the lines of: what exact skills do companies look for when they recruit a Data Scientist, and is there a definition of the Data Scientist profile? As is pretty obvious, there is no single profile, since every company is solving its own set of problems. But I tried to come up with a few generic job profiles that roughly fit the JDs of different companies. I think there is far more variety than this, but I had to narrow it down to a set of profiles, so here is the list:

  1. The R-using number-cruncher. Can run quick Group By's and Counts on numbers in R/Python. This profile is the coding version of the Data Analyst from earlier days. It is most commonly found doing automated report generation in a more analyst-y organization.
    Tools Used: R (dataframes), SQL

  2. The Modeller. A deeply mathematical mind who can apply Bayesian/Frequentist inference or hierarchical models. I am probably lumping too many people into a single group here: people analyzing drug trials, scientists modelling complex phenomena and people running autoregressive models on stocks all end up together. The common theme is that Mathematics forms the base of the work.
    Tools Used: R is very popular; also Fortran, C++ and sometimes functional languages.

  3. The Data Engineer who is also an occasional Data Scientist. Take a library from here, take some code from there and make something good enough while you manage the data pipeline. A very common profile. Data Science tasks include writing programs to automate report generation in Pandas, trying out simple Machine Learning models and (nowadays) running a pretrained Neural Network on the data.
    Tools: Python toolchain, Pandas, nltk, Keras.

  4. The tabular ML'er (or the XGBoost specialist). An ardent Kaggler who can train multiple algorithms, stack models and optimize the heck out of them. These guys have deep expertise in running and optimizing standard algorithms like XGBoost, Ridge Regression and (nowadays) Keras models.
    Tools: Python or R; uses XGBoost and Keras a lot.

  5. The old-style ML'er. Close to 4, but not limited to categorical models only. Very good at feature engineering. This was the only kind of Machine Learning expertise around until the newer Deep Learning profile came up.
    Tools: C++ / Python with scikit-learn.

  6. The Deep Learning guy. Needs a GPU system and a well-tagged dataset. Will spend a lot of time trying out architectures and minimal time on feature engineering, but the accuracy will be insane.
    Tools: Python, Theano, TensorFlow and high-level libraries like Keras.

  7. The domain specialist. Knows a lot about the domain and something about linear models. Encodes the domain information and trains a linear algorithm on top. Includes mechanical engineers, analysts at different firms and scientists in pure/applied sciences.
    Tools: Different specializations use different things: MATLAB for engineers, C++/Fortran and sometimes R/Python.

  8. The newbie. The intern. Will evolve into whichever of the 7 categories his/her mentor belongs to.
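
To make profile 1 a little more concrete, the "quick Group By's and Counts" can be sketched in a few lines of Python. This is only an illustration, not taken from any particular company's workflow: the `sales` table, its `region` and `amount` columns, the rows and the `summarize` helper are all made up for the example. It uses Python's built-in sqlite3 module, so the SQL from the tools list does the actual grouping.

```python
import sqlite3

# Hypothetical toy data: (region, amount) rows standing in for a sales table.
ROWS = [
    ("north", 120.0),
    ("south", 80.0),
    ("north", 60.0),
    ("east", 200.0),
    ("south", 40.0),
]


def summarize(rows):
    """Return (region, order_count, total_amount) per region, sorted by region."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
        conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT region, COUNT(*), SUM(amount) "
            "FROM sales GROUP BY region ORDER BY region"
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    # A tiny "automated report": one line per region.
    for region, n_orders, total in summarize(ROWS):
        print(f"{region}: {n_orders} orders, total {total:.2f}")
```

In a real analyst-y setup the same query would more likely run against a warehouse table or a dataframe and feed a scheduled report, but the shape of the work is the same: group, count, sum, publish.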

At ParallelDots, we have people of types 2, 3, 4, 5 and 6 (and 8, if you want to join us full-time).


ParallelDots AI APIs are a Deep Learning powered web service by ParallelDots Inc. that can comprehend a huge amount of unstructured text and visual content to empower your products. You can check out some of our text analysis APIs and reach out to us by filling out this form or writing to us at apis@paralleldots.com.
