ICS C165: Foundations of Data Science and Artificial Intelligence
Item | Value |
---|---|
Curriculum Committee Approval Date | 12/06/2024 |
Top Code | 070700 - Computer Software Development |
Units | 3 Total Units |
Hours | 54 Total Hours (Lecture Hours 54) |
Total Outside of Class Hours | 0 |
Course Credit Status | Credit: Degree Applicable (D) |
Material Fee | No |
Basic Skills | Not Basic Skills (N) |
Repeatable | No |
Open Entry/Open Exit | No |
Grading Policy | Standard Letter (S) |
Course Description
This course introduces the foundational concepts, methods, and tools of data science and artificial intelligence (AI). Topics include data collection and cleaning, exploratory data analysis, statistical and data modeling techniques, data visualization, and the ethical implications of AI. Students will develop a strong foundation in understanding how data can be leveraged to make predictions and informed decisions across various domains. This course combines theory and practical applications, equipping students with the skills needed to analyze data, understand AI’s potential and limitations, and create basic models that can learn from data. Transfer Credit: CSU.
Course Level Student Learning Outcome(s)
- Describe the components of the data science and artificial intelligence lifecycles.
- Explain the ethical considerations in data science and artificial intelligence, including data privacy and bias.
- Propose a data science or artificial intelligence solution to address an organizational challenge based on a given scenario.
Course Objectives
- 1. Describe the stages of the data science lifecycle, from data collection and preparation to analysis, interpretation, and reporting.
- 2. Explain how to apply data cleaning techniques to handle missing values, remove duplicates, and ensure consistency in datasets.
- 3. Understand and discuss ethical issues in data science, including data privacy, informed consent, and bias mitigation.
- 4. Define statistical techniques such as descriptive statistics, hypothesis testing, and probability, and explain how they are used to analyze data and identify trends.
- 5. Demonstrate how to write and execute basic code in a programming language like Python or R for data manipulation, analysis, and visualization.
- 6. Provide examples of methods to approach data-driven problems systematically, to develop a clear question, analyze data, and interpret results to reach conclusions.
- 7. Provide information about career pathways in data science.
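As an illustration of objectives 2, 4, and 5, the following is a minimal Python sketch of data cleaning followed by descriptive statistics. It uses only the standard library. The sample records, the `clean` function, and the choice to fill missing scores with the median are assumptions made for this example, not requirements of the course.

```python
import statistics

# Hypothetical sample records; None marks a missing value.
raw_rows = [
    {"id": 1, "city": " Irvine", "score": 82.0},
    {"id": 2, "city": "irvine", "score": None},
    {"id": 2, "city": "irvine", "score": None},  # duplicate record
    {"id": 3, "city": "Tustin ", "score": 91.0},
    {"id": 4, "city": "TUSTIN", "score": 76.0},
]


def clean(rows):
    """Drop duplicate ids, normalize city names, and fill missing scores."""
    seen_ids = set()
    unique = []
    for row in rows:
        if row["id"] in seen_ids:
            continue  # keep only the first record for each id
        seen_ids.add(row["id"])
        # Copy the row and make the city spelling consistent.
        unique.append({**row, "city": row["city"].strip().title()})

    # Fill missing scores with the median of the known scores (one common choice).
    known = [r["score"] for r in unique if r["score"] is not None]
    fill_value = statistics.median(known)
    for r in unique:
        if r["score"] is None:
            r["score"] = fill_value
    return unique


cleaned = clean(raw_rows)
scores = [r["score"] for r in cleaned]

# Descriptive statistics on the cleaned column.
summary = {
    "mean": statistics.mean(scores),
    "median": statistics.median(scores),
    "stdev": statistics.stdev(scores),
}
print(cleaned)
print(summary)
```

In practice a data frame library would usually handle these steps; the standard-library version is shown only to keep each operation visible.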
Lecture Content
- Introduction to Data Science
  - Overview of data science and its applications in various fields (e.g., business, healthcare, finance)
  - The data science lifecycle: problem identification, data collection, data processing, data exploration, model development, model evaluation, deployment and enhancement
  - Common tools and languages used in data science (e.g., Python, R, SQL)
- Introduction to Artificial Intelligence (AI)
  - Overview of AI and its applications in various fields (e.g., business, healthcare, finance)
  - The machine learning lifecycle: problem definition, data collection, data cleaning and preprocessing, exploratory data analysis (EDA), feature engineering and selection, model selection, model training, model evaluation and tuning, model deployment, model monitoring and maintenance
  - Common tools and languages used in AI
- Types of Data and Data Collection Methods
  - Types of data: structured, unstructured, quantitative, qualitative
  - Sources of data: databases, APIs, surveys, experiments, and web scraping
  - Basic data collection techniques and considerations for quality data
- Data Cleaning and Preparation
  - Introduction to data cleaning and preprocessing
  - Handling missing values, duplicates, and inconsistencies
  - Data transformation techniques: normalization, scaling, and encoding
  - Tools for data wrangling
- Exploratory Data Analysis (EDA)
  - Descriptive statistics: mean, median, mode, variance, and standard deviation
  - Visualizing data distributions and relationships
  - Identifying outliers and patterns in data
  - EDA tools and libraries for visual and statistical exploration
- Data Visualization
  - Principles of effective data visualization: clarity, accuracy, and storytelling
  - Common visualization types: histograms, scatter plots, line graphs, and bar charts
  - Introduction to data visualization libraries (e.g., Matplotlib, Seaborn in Python or ggplot2 in R)
  - Creating dashboards and reports for data communication
- Introduction to Data Modeling Techniques
  - Probability-based models
  - Classification-based models
  - Regression-based models
- Basic Programming for Data Science & AI
  - Introduction to programming languages for data science & AI
  - Data structures: lists, dictionaries, arrays, and data frames
  - Writing and debugging basic functions and scripts
  - Importing, exporting, and manipulating data programmatically
- Introduction to Machine Learning Concepts
  - Overview of supervised vs. unsupervised learning
  - Basic machine learning algorithms: linear regression, clustering
  - Introduction to neural networks
  - Model evaluation metrics: accuracy, precision, recall
  - Introduction to machine learning libraries (e.g., Scikit-learn in Python or caret in R)
- Data Ethics and Responsible Data Use
  - Introduction to data ethics: privacy, security, and consent
  - Bias and fairness in data science and machine learning
  - Responsible data collection and ethical considerations in data science and AI
- Model Evaluation
  - Performance metrics
  - Cross-validation
  - Bias-variance tradeoff
- Model Deployment
  - Model serving and infrastructure
  - Monitoring and maintenance
  - Security and compliance
- Model Enhancement
  - Feature engineering and selection
  - Hyperparameter tuning
  - Ensemble methods
- Case Study and Hands-on Project
  - Working with a real or simulated dataset to solve a data problem
  - Applying the data science lifecycle: from data collection to reporting insights
  - Hands-on practice in data cleaning, analysis, visualization, and interpretation
  - Presenting findings and communicating insights effectively
- Interpreting and Communicating Data Insights
  - Best practices for written and visual communication of data findings
  - Storytelling with data: structuring narratives around insights
  - Introduction to report writing and presentation tools (e.g., Jupyter notebooks, PowerPoint)
  - Examples of effective data storytelling
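The model evaluation metrics named in the lecture content (accuracy, precision, recall) can be sketched in a few lines of plain Python for a binary classifier. The function names and the sample labels here are hypothetical, chosen only for illustration; a library such as Scikit-learn provides equivalent functions.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def evaluate(y_true, y_pred, positive=1):
    """Return accuracy, precision, and recall; a zero denominator yields 0.0."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical labels: 1 = positive class, 0 = negative class.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

metrics = evaluate(y_true, y_pred)
print(metrics)  # accuracy, precision, and recall are each 0.75 for these labels
```

Accuracy counts all correct predictions, precision asks how many predicted positives were right, and recall asks how many actual positives were found, which is why the three can diverge on imbalanced data.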
Method(s) of Instruction
- Lecture (02)
- DE Live Online Lecture (02S)
- DE Online Lecture (02X)
Instructional Techniques
This course will use a combination of lecture, guided hands-on assignments, classroom discussion and student interaction, problem solving, quizzes, tests, and troubleshooting assignments to achieve the goals and objectives of this course. All instructional methods are consistent across all modalities.
Reading Assignments
Students will read about data science topics and data visualization techniques. Students will read about the data science lifecycle, from data collection to reporting insights. Students will read about bias and the ethical and social implications of applying data science techniques.
Writing Assignments
Students will complete written reports related to data science concepts such as data cleaning and data visualization. Students will complete written reports related to popular data science programming tools and libraries. Students will complete written reports related to bias and ethical and social implications of data science.
Out-of-class Assignments
Students will practice data science skills, including data cleaning, analysis, visualization, and interpretation. Students will complete hands-on assignments related to data science such as installation of popular data science programming tools and libraries. Students will work in a lab environment to complete the following hands-on assignments:
- Introduction to AI Tools and Libraries
  - Install and set up popular AI libraries like TensorFlow or PyTorch.
  - Implement a basic machine learning algorithm (e.g., linear regression or k-nearest neighbors) using the chosen library.
  - Load a sample dataset, preprocess the data, and train the model.
  - Evaluate the model's performance and visualize the results.
- Natural Language Processing (NLP)
  - Implement a text tokenization algorithm to break down sentences into words.
  - Perform text preprocessing tasks such as stemming, lemmatization, and stop-word removal.
  - Build a sentiment analysis classifier using a dataset of movie reviews or social media posts.
  - Implement a basic chatbot using rule-based or machine learning approaches.
- Computer Vision
  - Load and display images using a programming language like Python.
  - Apply basic image processing techniques such as blurring, edge detection, and resizing.
  - Implement object detection using pre-trained models like YOLO (You Only Look Once) or SSD (Single Shot MultiBox Detector).
  - Develop a facial recognition system using Haar cascades or deep learning models.
- Reinforcement Learning
  - Implement a simple reinforcement learning problem, such as the Multi-Armed Bandit problem.
  - Develop a Q-learning algorithm for solving a grid-world problem.
  - Experiment with deep reinforcement learning using algorithms like Deep Q-Networks (DQN).
  - Design and implement a reinforcement learning agent to play a game, such as Tic-Tac-Toe or Flappy Bird.
- AI Ethics and Bias
  - Analyze a real-world case study involving ethical implications of AI technology.
  - Discuss bias in AI datasets and algorithms, and explore techniques for bias mitigation.
  - Debate ethical dilemmas related to AI applications in areas like healthcare, autonomous vehicles, or criminal justice.
  - Propose strategies for responsible AI development and deployment.
- AI Project
  - Select a project topic related to AI (e.g., healthcare, finance, gaming).
  - Define the problem, collect and preprocess data, and choose appropriate AI techniques for the project.
  - Implement the AI solution and evaluate its performance using relevant metrics.
  - Present the project outcomes, challenges faced, and lessons learned to the class.
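As one possible starting point for the Multi-Armed Bandit assignment in the Reinforcement Learning lab, the sketch below uses an epsilon-greedy strategy with Bernoulli (0 or 1) rewards. The function name, the arm probabilities, and the default parameters are assumptions made for this example; instructors may specify a different strategy or setup.

```python
import random


def run_epsilon_greedy(true_probs, steps=5000, epsilon=0.1, seed=0):
    """Simulate an epsilon-greedy agent on a Bernoulli multi-armed bandit.

    true_probs holds each arm's hidden probability of paying a reward of 1.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    counts = [0] * n_arms        # number of pulls per arm
    estimates = [0.0] * n_arms   # running mean reward per arm
    total_reward = 0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        reward = 1 if rng.random() < true_probs[arm] else 0
        counts[arm] += 1
        # Incremental update of the sample mean for the chosen arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return counts, estimates, total_reward


# Hypothetical arm probabilities, unknown to the agent.
counts, estimates, total_reward = run_epsilon_greedy([0.2, 0.5, 0.8])
print("pulls per arm:", counts)
print("estimated reward per arm:", [round(e, 3) for e in estimates])
print("total reward:", total_reward)
```

The `epsilon` parameter controls the trade-off between exploring arms at random and exploiting the best estimate so far; trying several values is a simple experiment for the lab.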
Demonstration of Critical Thinking
Students will apply critical thinking skills through the implementation of data science programming tools and libraries. Students will demonstrate critical thinking skills by loading a sample dataset, cleaning the data, and preparing a data visualization.
Required Writing, Problem Solving, Skills Demonstration
Students will complete written reports related to sources of data, such as databases, APIs, surveys, experiments, and web scraping. Students will complete written reports related to popular data science programming tools and libraries. Students will complete written reports related to bias and ethical and social implications of data science. Students will complete hands-on assignments related to data science such as installation of popular programming tools and libraries.
Eligible Disciplines
- Computer information systems (computer network installation, microcomputer ...: Any bachelor's degree and two years of professional experience, or any associate degree and six years of professional experience.
- Computer science: Master's degree in computer science or computer engineering OR bachelor's degree in either of the above AND master's degree in mathematics, cybernetics, business administration, accounting or engineering OR bachelor's degree in engineering AND master's degree in cybernetics, engineering mathematics, or business administration OR bachelor's degree in mathematics AND master's degree in cybernetics, engineering mathematics, or business administration OR bachelor's degree in any of the above AND a master's degree in information science, computer information systems, or information systems OR the equivalent. Note: Courses in the use of computer programs for application to a particular discipline may be classified, for the minimum qualification purposes, under the discipline of the application. Master's degree required.
- Computer service technology: Any bachelor's degree and two years of professional experience, or any associate degree and six years of professional experience.
Textbook Resources
1. Required: Davies, S. The Crystal Ball Instruction Method; Volume One: Introduction to Data Science, 1.1 ed. James Farmer Hall, 2021. Legacy Textbook Transfer Data: Open Education Resource.
Other Resources
1. GitHub Digital Resources
2. Coastline Library
3. White papers, security reports, and articles are available at no charge to all students at multiple sites as recommended by the instructor.