Doing Data Science: Straight Talk From the Frontline

Book

Authors: O'Neil, Cathy | Schutt, Rachel
Edition: First edition
Publication: Sebastopol, CA : O'Reilly, 2013
Physical description: xxiv, 375 p. : illustrations ; 23 cm
Bibliography note: Includes index
Subjects: Big data; Cyberinfrastructure; Data mining; Database management; Information science

Holdings

Location	Registration no.	Call number	Status	Due date
Available (1)
Collection room	E207072		Available for loan	-

About the Book

Now that answering complex and compelling questions with data can make the difference in an election or a business model, data science is an attractive discipline. But how can you learn this wide-ranging, interdisciplinary field? With this book, you’ll get material from Columbia University’s "Introduction to Data Science" class in an easy-to-follow format.

Each chapter-long lecture features a guest data scientist from a prominent company such as Google, Microsoft, or eBay teaching new algorithms, methods, or models by sharing case studies and actual code they use. You’ll learn what’s involved in the lives of data scientists and be able to use the techniques they present.

Guest lectures focus on topics such as:

Machine learning and data mining algorithms
Statistical models and methods
Prediction vs. description
Exploratory data analysis
Communication and visualization
Data processing
Big data
Programming
Ethics
Asking good questions

If you’re familiar with linear algebra, probability, and statistics, and have some programming experience, this book will get you started with data science.

Doing Data Science is a collaboration between course instructor Rachel Schutt (also employed by Google) and data science consultant Cathy O’Neil (former quantitative analyst for D.E. Shaw), who attended and blogged about the course.

Table of Contents

Preface
1. Introduction: What Is Data Science?
2. Statistical Inference, Exploratory Data Analysis, and the Data Science Process
3. Algorithms
4. Spam Filters, Naive Bayes, and Wrangling
5. Logistic Regression
6. Time Stamps and Financial Modeling
7. Extracting Meaning from Data
8. Recommendation Engines: Building a User-Facing Data Product at Scale
9. Data Visualization and Fraud Detection
10. Social Networks and Data Journalism
11. Causality
12. Epidemiology
13. Lessons Learned from Data Competitions: Data Leakage and Model Evaluation
14. Data Engineering: MapReduce, Pregel, and Hadoop
15. The Students Speak
16. Next-Generation Data Scientists, Hubris, and Ethics
Index

About the Authors

Author: Cathy O'Neil

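She graduated from UC Berkeley and received a PhD in mathematics from Harvard University in 1999. After a postdoctoral position at the Massachusetts Institute of Technology (MIT), she served as a tenured professor in the mathematics department at Barnard College. In 2007 she left academia to become a quant at the hedge fund D.E. Shaw on Wall Street, where she lived through the financial boom and collapse of the 2000s. She later worked in the tech industry as a data scientist, developing mathematical models that predict, among other things, the risk of financial products and consumer purchasing patterns.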

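According to O'Neil, who has designed algorithms in commerce, finance, and education, big data and algorithms that are reputed to be fair and objective are in fact biased and ...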

Author: Rachel Schutt

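She is Senior Vice President of Data Science at News Corp. She received a PhD in statistics from Columbia University and worked for several years as a statistician at Google Research. She is an adjunct professor in Columbia's Department of Statistics and a founding member of the Education Committee of Columbia's Institute for Data Sciences and Engineering. She has several patents pending based on her work at Google, where she helped build user-facing products by prototyping algorithms and building models to understand user behavior. She received a master's degree in mathematics from NYU and, at Stanford University, a master's degree in Engineering-Economic Systems and Operations Research ...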

Author information provided by Aladin.
