The shortage of qualified software developers in the German-speaking region is becoming more acute. Effective collaboration in globally distributed teams is therefore a decisive competitive factor, and offshoring is becoming ever more relevant. The author aims to bring the topic closer to small and medium-sized enterprises as well, and to lower the barriers to entry for cost-effective offshore software development. He shows how companies can carry out offshore projects successfully: with a practical focus, concrete case studies, and guidance on project execution. Readers are given tools with which they can reduce the risks of executing offshore projects without losing the cost advantages.
Big Data Analytics with Spark is a step-by-step guide to learning Spark, an open-source, fast, general-purpose cluster computing framework for large-scale data analysis. You will learn how to use Spark for different types of big data analytics projects, including batch, interactive, graph, and stream data analysis as well as machine learning. In addition, this book will help you become a much sought-after Spark expert. Spark is one of the hottest big data technologies. The amount of data generated today by devices, applications, and users is exploding, so there is a critical need for tools that can analyze large-scale data and unlock value from it. Spark is a powerful technology that meets that need. You can, for example, use Spark to perform low-latency computations through efficient caching and iterative algorithms; leverage its shell for easy, interactive data analysis; or employ its fast batch processing and low-latency features to process real-time data streams. As a result, adoption of Spark is growing rapidly, and it is replacing Hadoop MapReduce as the technology of choice for big data analytics. This book provides an introduction to Spark and related big data technologies. It covers Spark core and its add-on libraries, including Spark SQL, Spark Streaming, GraphX, and MLlib. Big Data Analytics with Spark is therefore written for busy professionals who prefer learning a new technology from a consolidated source instead of spending countless hours on the Internet trying to piece together bits from different sources. The book also provides a chapter on Scala, the functional programming language in which Spark is written. You'll learn the basics of functional programming in Scala so that you can write Spark applications in it. What's more, Big Data Analytics with Spark provides an introduction to other big data technologies that are commonly used along with Spark, such as Hive, Avro, and Kafka. The book is thus self-sufficient: all the technologies you need to know to use Spark are covered. The only thing you are expected to know is programming in some language. There is a critical shortage of people with big data expertise, so companies are willing to pay top dollar for people with skills in areas like Spark and Scala. Reading this book and absorbing its principles will provide a boost, possibly a big boost, to your career.
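The caching behavior behind Spark's low-latency claim is easy to see in a few lines. Below is a minimal PySpark sketch (illustrative only, not code from the book, which teaches Scala; the file path is a placeholder):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("cache-demo").getOrCreate()

lines = spark.read.text("logs.txt")   # placeholder path
lines.cache()                         # keep the dataset in memory after first use

errors = lines.filter(lines.value.contains("ERROR"))
print(errors.count())   # first action reads from disk and populates the cache
print(errors.count())   # repeated actions reuse the in-memory copy

spark.stop()
```

Iterative algorithms and interactive shell sessions benefit from the same mechanism: once the data is cached, every subsequent pass over it avoids the disk.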
This book contains extended and revised versions of a set of selected papers from two events organized by the Euro Working Group on Decision Support Systems (EWG-DSS), held in Toulouse, France, and Barcelona, Spain, in June and July 2014. Overall, 8 papers were accepted for publication in this edition after a rigorous review process by at least three internationally recognized experts from the EWG-DSS Program Committee and external invited reviewers. The selected papers focus on knowledge management and sharing, and on information models developed to support various decision processes.
This book constitutes the proceedings of the workshops held at the International Conference on Social Informatics, SocInfo 2014, which took place in Barcelona, Spain, in November 2014. SocInfo 2014 included nine satellite workshops: the City Labs Workshop; the Workshop on Criminal Network Analysis and Mining (CRIMENET); the Workshop on Interaction and Exchange in Social Media (DYAD); the Workshop on Exploration of Games and Gamers (EGG); the Workshop on HistoInformatics; the Workshop on Socio-Economic Dynamics, Networks and Agent-based Models (SEDNAM); the Workshop on Social Influence (SI); the Workshop on Social Scientists Working with Start-Ups; and the Workshop on Social Media in Crowdsourcing and Human Computation (SoHuman).
Data assimilation is a hugely important mathematical technique, relevant in fields as diverse as geophysics, data science, and neuroscience. This modern book provides an authoritative treatment of the field as it relates to several scientific disciplines, with a particular emphasis on recent developments from machine learning and its role in the optimisation of data assimilation. Underlying theory from statistical physics, such as path integrals and Monte Carlo methods, is developed in the text as a basis for data assimilation, and the author then explores examples from current multidisciplinary research, such as the modelling of shallow water systems, ocean dynamics, and neuronal dynamics in the avian brain. The theory of data assimilation and machine learning is introduced in an accessible and unified manner, and the book is suitable for undergraduate and graduate students from science and engineering without specialized experience of statistical physics.
This book constitutes the thoroughly refereed post-conference proceedings of the First and Second International Workshops on In Memory Data Management and Analysis, held in Riva del Garda, Italy, in August 2013 and in Hangzhou, China, in September 2014. The 11 revised full papers were carefully reviewed and selected from 18 submissions and cover topics from main-memory graph analytics platforms to main-memory OLTP applications.
The two-volume set LNCS 9014 and LNCS 9015 constitutes the refereed proceedings of the 12th International Conference on Theory of Cryptography, TCC 2015, held in Warsaw, Poland, in March 2015. The 52 revised full papers presented were carefully reviewed and selected from 137 submissions. The papers are organized in topical sections on foundations, symmetric key, multiparty computation, concurrent and resettable security, non-malleable codes and tampering, privacy amplification, encryption and key exchange, pseudorandom functions and applications, proofs and verifiable computation, differential privacy, functional encryption, and obfuscation.
This book presents methods for tapping overlooked IT potential in organizations. The author's premise is that the knowledge is already there and merely needs to be unearthed. With checklists and tips for implementation.
Data Scientists at Work is a collection of interviews with sixteen of the world's most influential and innovative data scientists from across the spectrum of this hot new profession. "Data scientist is the sexiest job in the 21st century," according to the Harvard Business Review, and a McKinsey report projected that by 2018 the United States would face a shortage of 190,000 skilled data scientists. Through incisive in-depth interviews, this book mines the what, how, and why of the practice of data science from the stories, ideas, shop talk, and forecasts of its preeminent practitioners across diverse industries: social networking (Yann LeCun, Facebook); professional networking (Daniel Tunkelang, LinkedIn); venture capital (Roger Ehrenberg, IA Ventures); enterprise cloud computing and neuroscience (Eric Jonas, formerly Salesforce.com); newspapers and media (Chris Wiggins, The New York Times); streaming television (Caitlin Smallwood, Netflix); music forecasting (Victor Hu, Next Big Sound); strategic intelligence (Amy Heineike, Quid); environmental big data (Andre Karpistsenko, Planet OS); geospatial marketing intelligence (Jonathan Lenaghan, PlaceIQ); advertising (Claudia Perlich, Dstillery); fashion e-commerce (Anna Smith, Rent the Runway); specialty retail (Erin Shellman, Nordstrom); email marketing (John Foreman, MailChimp); predictive sales intelligence (Kira Radinsky, SalesPredict); and humanitarian nonprofit work (Jake Porway, DataKind). The book features a stimulating foreword by Google's Director of Research, Peter Norvig. Each of these data scientists shares how he or she tailors the torrent-taming techniques of big data, data visualization, search, and statistics to specific jobs by dint of ingenuity, imagination, patience, and passion. Data Scientists at Work parts the curtain on the interviewees' earliest data projects, how they became data scientists, their discoveries and surprises in working with data, their thoughts on the past, present, and future of the profession, their experiences of team collaboration within their organizations, and the insights they have gained as they get their hands dirty refining mountains of raw data into objects of commercial, scientific, and educational value for their organizations and clients.
This book constitutes revised selected papers from the second ECML PKDD Workshop on Data Analytics for Renewable Energy Integration, DARE 2014, held in Nancy, France, in September 2014. The 11 papers presented were carefully reviewed and selected for inclusion in this volume.
Many applications depend on the effective acquisition of semantic metadata, and this state-of-the-art volume provides extensive coverage of the field of semantics acquisition games (SAGs). SAGs belong to the family of crowdsourcing approaches, and the authors analyze their role as tools for the acquisition of resource metadata and domain models. Three case studies of SAG-based semantics acquisition methods are presented, along with other existing SAGs:
1. Little Search Game, a search query formulation game using negative search, which serves for the acquisition of lightweight semantics;
2. PexAce, a card game that acquires annotations for images;
3. CityLights, a SAG used for the validation of music metadata.
The authors also examine SAGs from the design perspective, covering SAG design issues and existing patterns, including several novel ones. For solving cold-start problems, a "helper artifact" scheme is presented, and for dealing with malicious player behavior, an a posteriori cheating detection scheme is given. The book also presents methods for assessing information about player expertise, which can be used to make SAGs more effective in terms of useful output.
Over the course of its development, administrative language has been subject to linguistic standardization aimed at general applicability for its addressees, with its historical codification recorded both in dictionaries and in other documents. This also applies to the binding use of gender-appropriate wording in Austrian administrative language: by rephrasing a sentence, the acting person is to be named unambiguously in audit reports. This study shows to what extent these goals of optimal comprehensibility and readability of administrative language and its texts can be achieved for the addressees with the help of computer-based support. Additional information material accompanies the book on a CD.
This textbook grew out of notes for the ECE143 Programming for Data Analysis class that the author has been teaching at the University of California, San Diego, where it is a requirement for both graduate and undergraduate degrees in Machine Learning and Data Science. This book is ideal for readers with some Python programming experience. The book covers key language concepts that must be understood to program effectively, especially for data analysis applications. Certain low-level language features are discussed in detail, especially Python memory management and data structures. Using Python effectively means taking advantage of its vast ecosystem. The book discusses Python package management and how to use third-party modules, as well as how to structure your own Python modules. The section on object-oriented programming explains features of the language that facilitate common programming patterns. After developing the key Python language features, the book moves on to third-party modules that are foundational for effective data analysis, starting with Numpy. The book develops key Numpy concepts and discusses internal Numpy array data structures and memory usage. Then the author moves on to Pandas and details its many features for data processing and alignment. Because strong visualizations are important for communicating data analysis, key modules such as Matplotlib are developed in detail, along with web-based options such as Bokeh, Holoviews, Altair, and Plotly. The text is sprinkled with many tricks of the trade that help avoid common pitfalls. The author explains the internal logic embodied in the Python language so that readers can get into the Python mindset and make better design choices in their code, which is especially helpful for newcomers to both Python and data analysis. To get the most out of this book, open a Python interpreter and type along with the many code samples.
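Two of the ideas this blurb highlights, Numpy's view-based memory model and Pandas' index alignment, can be seen in a short sketch (illustrative, not taken from the book):

```python
import numpy as np
import pandas as pd

# Numpy slices are views: they share the parent array's memory buffer
a = np.arange(12, dtype=np.int64).reshape(3, 4)
column = a[:, 1]      # no copy; strides step through the original buffer
column[0] = 99
print(a[0, 1])        # 99 -- the parent array sees the change
print(a.strides)      # (32, 8): bytes to step per row and per element

# Pandas aligns on index labels, not on position
s1 = pd.Series([1.0, 2.0, 3.0], index=["a", "b", "c"])
s2 = pd.Series([10.0, 20.0], index=["b", "c"])
print(s1 + s2)        # "a" has no partner, so it becomes NaN; "b" and "c" align
```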
This book is part of the project Transformación funcional de la literatura infantil y juvenil en la sociedad multimedia. Aplicación de un modelo teórico de crítica a las adaptaciones audiovisuales en español de las obras infantiles inglesas y alemanas (Functional transformation of children's and young adult literature in the multimedia society: applying a theoretical model of criticism to the Spanish-language audiovisual adaptations of English and German children's works), and it has a twofold objective: on the one hand, to analyze how works of English and German children's literature were adapted to the audiovisual medium and how the English and German films were transferred into Peninsular Spanish; and on the other, to study the quality of the children's books, and of their Spanish translations, that arise from these audiovisual products. The analysis of the audiovisual adaptations applies both technical and translational criteria, and the study of the derived books is carried out following literary criteria and, in the case of the analyses of the translations of these products, translational criteria as well.
Big Data Analytics Using Splunk is a hands-on book showing how to process and derive business value from big data in real time. Examples in the book draw from social media sources such as Twitter (tweets) and Foursquare (check-ins). You also learn to draw from machine data, enabling you to analyze, say, web server log files and patterns of user access in real time, as the access is occurring. Gone are the days of being caught out by shifting public opinion or sudden changes in customer behavior: Splunk's easy-to-use engine helps you recognize and react in real time, as events are occurring. Splunk is a powerful yet simple analytical tool that is fast gaining traction in the fields of big data and operational intelligence. Using Splunk, you can monitor data in real time or mine your data after the fact. Splunk's stunning visualizations aid in locating the needle of value in a haystack of data. Geolocation support spreads your data across a map, allowing you to drill down to geographic areas of interest. Alerts can run in the background and trigger to warn you of shifts or events as they take place. With Splunk you can immediately recognize and react to changing trends and shifting public opinion as expressed through social media, and to new patterns of eCommerce and customer behavior. The ability to immediately recognize and react to changing trends provides a tremendous advantage in today's fast-paced world of Internet business. Big Data Analytics Using Splunk opens the door to an exciting world of real-time operational intelligence. The book is built around hands-on projects, shows how to mine social media, and opens the door to real-time operational intelligence.
What you'll learn:
- Monitor and mine social media for trends affecting your business
- Know how you are perceived, and when that perception is rising or falling
- Detect changing customer behavior by mining your operational data
- Collect and analyze in real time, or from historical files
- Apply basic analytical metrics to better understand your data
- Create compelling visualizations and easily communicate your findings
Who this book is for: Big Data Analytics Using Splunk is for those who are interested in exploring the heaps of data they have available but don't know where to start. It is for people who know the data they want to analyze and are developers or SQL programmers at a level anywhere between beginner and intermediate. Expert developers also benefit from learning how to use such a simple and powerful tool as Splunk.
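Splunk searches are written in its Search Processing Language (SPL); as a rough sketch of driving one programmatically, here is how a one-shot query might look with the splunk-sdk Python package (the connection details are placeholders, and the example is illustrative rather than taken from the book):

```python
import splunklib.client as client
import splunklib.results as results

# Placeholder connection details for a local Splunk instance
service = client.connect(host="localhost", port=8089,
                         username="admin", password="changeme")

# Run a blocking one-shot search and iterate over its results
stream = service.jobs.oneshot("search index=_internal | stats count by sourcetype")
for result in results.ResultsReader(stream):
    print(result)
```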
This book focuses on research aspects of ensemble approaches in machine learning that can be applied to big data problems. Various advancements of machine learning algorithms for extracting data-driven decisions from big data in diverse domains such as banking, healthcare, social media, and video surveillance are presented across its chapters. Each chapter has a distinct focus and can be leveraged to solve a specific set of big data applications. The book is a resource on advances in machine learning and data science for solving big data problems with many objectives. The literature shows that many works have focused on advancing machine learning in fields like biomedicine, stock prediction, and sentiment analysis; however, the application of advanced machine learning techniques to big data problems has received only limited discussion.
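As a small, generic illustration of the ensemble idea the book builds on (not an example from the book itself), scikit-learn can combine heterogeneous base learners into a single voting classifier:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a large tabular dataset
X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Soft voting averages the predicted probabilities of the base learners
ensemble = VotingClassifier(estimators=[
    ("lr", LogisticRegression(max_iter=1000)),
    ("tree", DecisionTreeClassifier(max_depth=5)),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
], voting="soft")
ensemble.fit(X_tr, y_tr)
print("held-out accuracy:", ensemble.score(X_te, y_te))
```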
If your organization is about to enter the world of big data, you not only need to decide whether Apache Hadoop is the right platform to use, but also which of its many components are best suited to your task. This field guide makes the exercise manageable by breaking down the Hadoop ecosystem into short, digestible sections. You'll quickly understand how Hadoop's projects, subprojects, and related technologies work together. Each chapter introduces a different topic, such as core technologies or data transfer, and explains why certain components may or may not be useful for particular needs. When it comes to data, Hadoop is a whole new ballgame, but with this handy reference, you'll have a good grasp of the playing field. Topics include:
- Core technologies: Hadoop Distributed File System (HDFS), MapReduce, YARN, and Spark
- Database and data management: Cassandra, HBase, MongoDB, and Hive
- Serialization: Avro, JSON, and Parquet
- Management and monitoring: Puppet, Chef, ZooKeeper, and Oozie
- Analytic helpers: Pig, Mahout, and MLlib
- Data transfer: Sqoop, Flume, distcp, and Storm
- Security, access control, and auditing: Sentry, Kerberos, and Knox
- Cloud computing and virtualization: Serengeti, Docker, and Whirr
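To make the MapReduce entry in the list above concrete, here is the classic word count expressed as a pair of Hadoop Streaming scripts; this is a generic sketch rather than an example from the guide:

```python
#!/usr/bin/env python3
# mapper.py -- emit "word<TAB>1" for every token read from stdin
import sys

for line in sys.stdin:
    for word in line.split():
        print(f"{word}\t1")
```

```python
#!/usr/bin/env python3
# reducer.py -- Hadoop sorts mapper output by key, so counts can be summed per group
import sys
from itertools import groupby

pairs = (line.rstrip("\n").split("\t", 1) for line in sys.stdin)
for word, group in groupby(pairs, key=lambda kv: kv[0]):
    print(f"{word}\t{sum(int(count) for _, count in group)}")
```

Both scripts would be submitted with the hadoop-streaming jar (paths here are placeholders): hadoop jar hadoop-streaming.jar -input in/ -output out/ -mapper mapper.py -reducer reducer.py -file mapper.py -file reducer.py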
How can you get your data from frontend servers to Hadoop in near real time? With this complete reference guide, you'll learn Flume's rich set of features for collecting, aggregating, and writing large amounts of streaming data to the Hadoop Distributed File System (HDFS), Apache HBase, SolrCloud, Elasticsearch, and other systems. Using Flume shows operations engineers how to configure, deploy, and monitor a Flume cluster, and teaches developers how to write Flume plugins and custom components for their specific use cases. You'll learn about Flume's design and implementation, as well as various features that make it highly scalable, flexible, and reliable. Code examples and exercises are available on GitHub.
- Learn how Flume provides a steady rate of flow by acting as a buffer between data producers and consumers
- Dive into key Flume components, including sources that accept data and sinks that write and deliver it
- Write custom plugins to customize the way Flume receives, modifies, formats, and writes data
- Explore APIs for sending data to Flume agents from your own applications
- Plan and deploy Flume in a scalable and flexible way, and monitor your cluster once it's running
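One of the APIs mentioned above is sending events to a running agent. As a hedged sketch: if an agent is configured with Flume's HTTP source (whose default JSONHandler accepts a JSON array of events, each with "headers" and "body" fields), events can be posted from plain Python. The host and port below are placeholder assumptions:

```python
import json
import urllib.request

# Placeholder endpoint for a Flume agent running an HTTP source
FLUME_URL = "http://flume-agent.example.com:44444"

events = [
    {"headers": {"host": "web01"}, "body": "GET /index.html 200"},
    {"headers": {"host": "web02"}, "body": "GET /missing 404"},
]
req = urllib.request.Request(
    FLUME_URL,
    data=json.dumps(events).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(resp.status)   # 200 once the events reach the agent's channel
```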
There has been intense excitement in recent years around activities labeled "data science," "big data," and "analytics." However, the lack of clarity around these terms and, particularly, around the skill sets and capabilities of their practitioners has led to inefficient communication between "data scientists" and the organizations requiring their services, and frequently to missed opportunities. To address this issue, we surveyed several hundred practitioners via the Web to explore the varieties of skills, experiences, and viewpoints in the emerging data science community. We used dimensionality reduction techniques to divide potential data scientists into five categories based on their self-ranked skill sets (Statistics, Math/Operations Research, Business, Programming, and Machine Learning/Big Data), and four categories based on their self-identification (Data Researchers, Data Businesspeople, Data Engineers, and Data Creatives). Further examining the respondents based on their division into these categories provided additional insights into the types of professional activities, educational background, and even scale of data used by different types of data scientists. In this report, we combine our results with insights and data from others to provide a better understanding of the diversity of practitioners, and to argue for the value of clearer communication around roles, teams, and careers.
Advanced Data Science and Analytics with Python enables data scientists to continue developing their skills and apply them in business as well as academic settings. The subjects discussed in this book are complementary to, and a follow-up of, the topics discussed in Data Science and Analytics with Python. The aim is to cover important advanced areas in data science using tools developed in Python such as scikit-learn, Pandas, Numpy, Beautiful Soup, NLTK, NetworkX, and others. The model development is supported by the use of frameworks such as Keras, TensorFlow, and Core ML, as well as Swift for the development of iOS and macOS applications. Features:
- Targets readers with a background in programming who are interested in the tools used in data analytics and data science
- Uses Python throughout
- Presents tools, alongside solved examples, with steps that the reader can easily reproduce and adapt to their needs
- Focuses on the practical use of the tools rather than on lengthy explanations
- Provides the reader with the opportunity to use the book whenever needed rather than following a sequential path
The book can be read independently of the previous volume, and each of its chapters is sufficiently independent from the others, providing flexibility for the reader. Each of the topics addressed in the book tackles the data science workflow from a practical perspective, concentrating on the process and the results obtained. The implementation and deployment of trained models are central to the book. Time series analysis, natural language processing, topic modelling, social network analysis, neural networks, and deep learning are comprehensively covered. The book discusses the need to develop data products and addresses the subject of bringing models to their intended audiences, in this case literally to the users' fingertips in the form of an iPhone app. About the author: Dr. Jesus Rogel-Salazar is a lead data scientist who has worked for companies such as Tympa Health Technologies, Barclays, AKQA, IBM Data Science Studio, and Dow Jones. He is a visiting researcher at the Department of Physics at Imperial College London, UK, and a member of the School of Physics, Astronomy and Mathematics at the University of Hertfordshire, UK.
A comprehensive compilation of new developments in data linkage methodology. The increasing availability of large administrative databases has led to a dramatic rise in the use of data linkage, yet the standard texts on linkage are still those describing the seminal work of the 1950s and 60s, with some updates. Linkage and analysis of data across sources remain problematic due to the lack of discriminatory and accurate identifiers, missing data, and regulatory issues. Recent developments in data linkage methodology have concentrated on bias and analysis of linked data, novel approaches to organising relationships between databases, and privacy-preserving linkage. Methodological Developments in Data Linkage brings together a collection of contributions from members of the international data linkage community, covering cutting-edge methodology in this field. It presents the opportunities and challenges provided by linkage of large and often complex datasets, including analysis problems, legal and security aspects, models for data access, and the development of novel research areas. New methods for handling uncertainty in the analysis of linked data, solutions for anonymised linkage, and alternative models for data collection are also discussed. Key features:
- Presents cutting-edge methods for a topic of increasing importance to a wide range of research areas, with applications to data linkage systems internationally
- Covers the essential issues associated with data linkage today
- Includes examples based on real data linkage systems, highlighting the opportunities, successes, and challenges that the increasing availability of linkage data provides
- Takes a novel approach incorporating technical aspects of linkage, management, and analysis of linked data
This book will be of core interest to academics, government employees, data holders, data managers, analysts, and statisticians who use administrative data. It will also appeal to researchers in a variety of areas, including epidemiology, biostatistics, social statistics, informatics, policy, and public health.
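As a toy illustration of the core linkage problem (mine, not the book's): without a shared accurate identifier, records must be matched on noisy fields such as names, combined with stronger fields such as date of birth:

```python
from difflib import SequenceMatcher

# Two tiny administrative "databases" with no common identifier
db_a = [{"id": 1, "name": "Jon Smith",   "dob": "1980-02-01"},
        {"id": 2, "name": "Mary Jones",  "dob": "1975-11-23"}]
db_b = [{"id": 7, "name": "John Smith",  "dob": "1980-02-01"},
        {"id": 9, "name": "Maria Jones", "dob": "1975-11-23"}]

def name_similarity(a, b):
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# Link records that share a date of birth and have sufficiently similar names
for rec_a in db_a:
    for rec_b in db_b:
        score = name_similarity(rec_a["name"], rec_b["name"])
        if rec_a["dob"] == rec_b["dob"] and score > 0.8:
            print(f"link {rec_a['id']} <-> {rec_b['id']} (name score {score:.2f})")
```

Real systems replace the plain string ratio with calibrated match weights and add blocking to avoid the quadratic comparison, which is exactly the methodological territory the book covers.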
Even as big data is turning the world upside down, the next phase of the revolution is already taking shape: real-time data analysis. This hands-on guide introduces you to Storm, a distributed, JVM-based system for processing streaming data. Through simple tutorials, sample Java code, and a complete real-world scenario, you'll learn how to build fast, fault-tolerant solutions that process results as soon as the data arrives. Discover how easy it is to set up Storm clusters for solving various problems, including continuous data computation, distributed remote procedure calls, and data stream processing.
- Learn how to program Storm components: "spouts" for data input and "bolts" for data transformation
- Discover how data is exchanged between spouts and bolts in a Storm "topology"
- Make spouts fault-tolerant with several commonly used design strategies
- Explore bolts: their life cycle, strategies for design, and ways to implement them
- Scale your solution by defining each component's level of parallelism
- Study a real-time web analytics system built with Node.js, a Redis server, and a Storm topology
- Write spouts and bolts in non-JVM languages such as Python, Ruby, and JavaScript
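The last bullet above, non-JVM components, works through Storm's multi-lang protocol. Here is a minimal Python bolt sketch using the storm.py helper module that ships with Storm's multi-lang support (the topology wiring itself remains in Java; this example is illustrative, not taken from the book):

```python
import storm  # helper module shipped with Storm's multi-lang support

class SplitSentenceBolt(storm.BasicBolt):
    def process(self, tup):
        # tup.values holds the tuple emitted by the upstream spout
        for word in tup.values[0].split():
            storm.emit([word])

SplitSentenceBolt().run()   # hands control to the multi-lang protocol loop
```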
What is latent class analysis? If you had asked that question thirty or forty years ago, you would have gotten a different answer than you would today. Closer to its time of inception, latent class analysis was viewed primarily as a categorical data analysis technique, often framed as a factor analysis model in which both the measured variable indicators and the underlying latent variables are categorical. Today, however, it rests within a much broader mixture and diagnostic modeling framework, integrating measured and latent variables that may be categorical and/or continuous, and in which latent classes serve to define the subpopulations for whom many aspects of the focal measured and latent variable model may differ. For latent class analysis to take these developmental leaps required contributions that were methodological, certainly, as well as didactic. Among the leaders on both fronts was C. Mitchell "Chan" Dayton, at the University of Maryland, whose work in latent class analysis spanning several decades helped the method expand and reach its current potential. The current volume in the Center for Integrated Latent Variable Research (CILVR) series reflects the diversity that is latent class analysis today, celebrating work related to, made possible by, and inspired by Chan's noted contributions, and signaling the even more exciting future yet to come.
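The classic model the passage starts from, binary items that are independent within each latent class, can be fitted with a few lines of EM. A compact sketch on simulated data (illustrative, not taken from the volume):

```python
import numpy as np

rng = np.random.default_rng(0)
n, J, C = 500, 6, 2   # respondents, binary items, latent classes

# Simulate two classes with different item-endorsement probabilities
true_theta = np.array([[0.9] * J, [0.2] * J])
z = rng.integers(0, C, size=n)
X = (rng.random((n, J)) < true_theta[z]).astype(float)

# Random starting values for class priors pi and item probabilities theta
pi = np.full(C, 1.0 / C)
theta = rng.uniform(0.3, 0.7, size=(C, J))

for _ in range(200):
    # E-step: posterior probability of each class for each respondent
    log_lik = X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T   # (n, C)
    log_post = np.log(pi) + log_lik
    log_post -= log_post.max(axis=1, keepdims=True)
    r = np.exp(log_post)
    r /= r.sum(axis=1, keepdims=True)
    # M-step: re-estimate priors and conditional item probabilities
    pi = r.mean(axis=0)
    theta = ((r.T @ X) / r.sum(axis=0)[:, None]).clip(1e-6, 1 - 1e-6)

print("estimated class sizes:", np.round(pi, 2))
print("estimated item probabilities:\n", np.round(theta, 2))
```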