
Saturday, October 18, 2025

WEEK 5 Programs

 

UNIT- V

Introduction to Data Science

Data Science

  • Data Science is the study of data to extract meaningful insights for decision-making.
  • It combines techniques from Statistics, Computer Science, and Domain Knowledge to analyze, visualize, and predict outcomes.
  • It involves collecting, cleaning, analyzing, and interpreting data to solve real-world problems.

Importance of Data Science

  • Helps organizations make data-driven decisions.
  • Enables automation and predictions using Machine Learning.
  • Supports business intelligence and strategic planning.
  • Plays a key role in fields like healthcare, finance, e-commerce, and social media.

Components of Data Science

  1. Data Collection – Gathering data from various sources (databases, web, sensors, etc.)
  2. Data Cleaning – Removing errors, duplicates, and missing values.
  3. Data Analysis – Using statistical methods and visualization to explore data.
  4. Data Visualization – Representing data using graphs, charts, and dashboards.
  5. Machine Learning – Building models to predict or classify data outcomes.
  6. Communication of Results – Presenting insights to decision-makers.

Data Science Workflow

  1. Define the Problem
  2. Collect Data
  3. Prepare Data (Cleaning and Transformation)
  4. Analyze & Build Model
  5. Evaluate Model Performance
  6. Deploy & Monitor the Model

Tools and Technologies Used

| Category | Tools/Technologies |
|---|---|
| Programming Languages | Python, R |
| Data Handling | SQL, Pandas, NumPy |
| Visualization | Matplotlib, Seaborn, Power BI, Tableau |
| Machine Learning | Scikit-learn, TensorFlow, PyTorch |
| Big Data | Hadoop, Spark |

 

Applications of Data Science

  • Healthcare – Disease prediction, drug discovery
  • Finance – Fraud detection, stock market analysis
  • E-commerce – Product recommendation systems
  • Social Media – Sentiment analysis, targeted advertising
  • Transportation – Route optimization, autonomous vehicles

Skills Required for Data Scientists

  • Programming skills (Python/R)
  • Mathematics & Statistics
  • Data Visualization
  • Machine Learning
  • Communication & Problem-solving skills

Careers in Data Science

  • Data Analyst
  • Data Engineer
  • Machine Learning Engineer
  • Data Scientist
  • Business Intelligence Analyst

Functional Programming in Python

Introduction

·         Functional Programming (FP) is a programming paradigm where programs are built using functions.

·         It focuses on what to solve rather than how to solve it.

·         Python supports both Object-Oriented and Functional programming styles (it’s a multi-paradigm language).

Key Concepts

| Concept | Description |
|---|---|
| Function | A block of code that performs a specific task and can be reused. |
| Pure Function | A function that always produces the same output for the same input and has no side effects. |
| Immutability | Data is not changed; instead, new data is created. |
| First-Class Functions | Functions can be assigned to variables, passed as arguments, or returned from other functions. |
| Higher-Order Functions | Functions that take other functions as arguments or return them as results. |
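The concepts in the table can be sketched in a few lines. This is only an illustration: the helper names add_tax, apply_twice, and make_multiplier are hypothetical and do not come from the notes.

```python
from typing import Callable

def add_tax(price: float, rate: float) -> float:
    """Pure function: the result depends only on the arguments."""
    return round(price * (1 + rate), 2)

def apply_twice(func: Callable[[int], int], value: int) -> int:
    """Higher-order function: takes another function as an argument."""
    return func(func(value))

def make_multiplier(factor: int) -> Callable[[int], int]:
    """Higher-order function: returns a new function."""
    def multiply(x: int) -> int:
        return x * factor
    return multiply

# First-class functions: a function is stored in a variable like any value
double = make_multiplier(2)

# Immutability: build a new tuple instead of changing the original
original = (1, 2, 3)
extended = original + (4,)

print(add_tax(100.0, 0.18))    # 118.0
print(apply_twice(double, 5))  # 20
print(original, extended)      # (1, 2, 3) (1, 2, 3, 4)
```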

 

Advantages of Functional Programming

·         Easier to debug and test.

·         Promotes code reusability.

·         Makes parallel and distributed computing easier, because pure functions do not share mutable state.

·         Produces clean and modular code.

Functional Programming Features in Python

Built-in Functions

Python provides several functional tools:

·         map() (built-in function)

·         filter() (built-in function)

·         reduce() (available in the functools module)

·         lambda (keyword for creating anonymous functions)

Lambda Functions

·         Small, anonymous functions created using the lambda keyword.

·         Syntax:

lambda arguments: expression

·         Example:

square = lambda x: x * x
print(square(5))  # Output: 25

 

map() Function

·         Applies a function to each item in an iterable (like a list).

numbers = [1, 2, 3, 4, 5]
squares = list(map(lambda x: x*x, numbers))
print(squares)  # Output: [1, 4, 9, 16, 25]

 

filter() Function

·         Filters elements from an iterable using a Boolean condition.

numbers = [1, 2, 3, 4, 5, 6]
even = list(filter(lambda x: x % 2 == 0, numbers))
print(even)  # Output: [2, 4, 6]

 

reduce() Function

·         Used to reduce a list to a single value by repeatedly applying a function.

·         It is available in the functools module.

from functools import reduce
numbers = [1, 2, 3, 4, 5]
product = reduce(lambda x, y: x * y, numbers)
print(product)  # Output: 120

 

Example: Combining Functional Tools

from functools import reduce
numbers = [1, 2, 3, 4, 5, 6]
result = reduce(lambda x, y: x + y,
                filter(lambda x: x % 2 == 0,
                       map(lambda x: x * x, numbers)))
print(result)  # Output: 56 (2² + 4² + 6²)
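For comparison, the same calculation can also be written with a generator expression and the built-in sum(). This is an alternative spelling shown only as a sketch, not a replacement for the map/filter/reduce version above.

```python
numbers = [1, 2, 3, 4, 5, 6]

# Square each number, keep only the even squares, and add them up
result = sum(x * x for x in numbers if (x * x) % 2 == 0)
print(result)  # 56
```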

 

JSON and XML in Python

Introduction

Data is often exchanged between applications using structured formats.
Two commonly used data formats are:

  • JSON (JavaScript Object Notation)
  • XML (eXtensible Markup Language)

Python provides libraries to read, write, and process both easily.

 

JSON in Python

 

What is JSON?

  • JSON stands for JavaScript Object Notation.
  • It is a lightweight data format used to store and exchange data between systems.
  • It is easy for humans to read and easy for machines to parse.

 

JSON Structure

JSON data is written as key–value pairs.
Example:

{

  "name": "Madhu",

  "age": 25,

  "department": "CSE",

  "skills": ["Python", "Data Science"]

}

 

JSON vs Python Dictionary

 

| JSON | Python |
|---|---|
| String format | Dictionary object |
| Uses double quotes | Uses single or double quotes |
| Can be stored in files | Used within programs |
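A short sketch of how the json module maps Python values to JSON text and back. The record below is made-up sample data; note that True becomes true, None becomes null, and a tuple comes back as a list.

```python
import json

record = {"name": "Madhu", "passed": True, "backlog": None, "marks": (90, 85)}

text = json.dumps(record)     # Python dict -> JSON string
restored = json.loads(text)   # JSON string -> Python dict

print(text)      # {"name": "Madhu", "passed": true, "backlog": null, "marks": [90, 85]}
print(restored)  # {'name': 'Madhu', 'passed': True, 'backlog': None, 'marks': [90, 85]}
```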

 

 

Working with JSON in Python

Python provides the built-in json module.

 

a) Importing JSON Module

import json

 

b) Converting Python Object to JSON

(Serialization – using json.dumps() or json.dump())

import json

data = {"name": "Madhu", "age": 25, "city": "Kurnool"}

json_string = json.dumps(data)

print(json_string)

 

c) Converting JSON to Python Object

(Deserialization – using json.loads() or json.load())

import json

json_data = '{"name": "Madhu", "age": 25, "city": "Kurnool"}'

python_obj = json.loads(json_data)

print(python_obj["name"])  # Output: Madhu

 

d) Reading JSON from a File

with open('data.json', 'r') as file:

    data = json.load(file)

 

e) Writing JSON to a File

with open('data.json', 'w') as file:

    json.dump(data, file)
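Steps d) and e) can be combined into one round trip. The sketch below assumes a temporary folder so that no real file is touched; the indent=2 argument is optional and only makes the file easier to read.

```python
import json
import os
import tempfile

data = {"name": "Madhu", "age": 25, "skills": ["Python", "Data Science"]}

# A temporary directory keeps the example from touching real files
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "data.json")

    with open(path, "w", encoding="utf-8") as file:
        json.dump(data, file, indent=2)   # write the dict as JSON text

    with open(path, "r", encoding="utf-8") as file:
        loaded = json.load(file)          # read it back as a dict

print(loaded == data)  # True
```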

XML in Python

What is XML?

  • XML (eXtensible Markup Language) is a markup language used to store and transport data.
  • It uses tags (like HTML) to define elements and their structure.

 

Example:

<student>

    <name>Madhu</name>

    <age>25</age>

    <department>CSE</department>

</student>

 

Features of XML

  • Self-descriptive and hierarchical.
  • Platform-independent.
  • Used in many web and data exchange applications.

 

Parsing XML in Python

Python provides the xml.etree.ElementTree module to parse and create XML data.

 

a) Reading XML Data

import xml.etree.ElementTree as ET

tree = ET.parse('student.xml')

root = tree.getroot()

print(root.tag)  # Output: student

for child in root:

    print(child.tag, ":", child.text)

 

b) Creating XML Data

import xml.etree.ElementTree as ET

student = ET.Element('student')

name = ET.SubElement(student, 'name')

name.text = 'Madhu'

age = ET.SubElement(student, 'age')

age.text = '25'

tree = ET.ElementTree(student)

tree.write('student.xml')
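When the XML is already in a string, ET.fromstring() parses it without a file. A minimal sketch using the student record from the earlier example; note that element text is always a string, so numbers need an explicit conversion.

```python
import xml.etree.ElementTree as ET

xml_text = "<student><name>Madhu</name><age>25</age><department>CSE</department></student>"

root = ET.fromstring(xml_text)       # parse directly from a string
name = root.find("name").text        # look up a child element by tag
age = int(root.find("age").text)     # element text is a string, so convert it

# Collect all child elements into a dictionary
student = {child.tag: child.text for child in root}

print(root.tag, name, age)  # student Madhu 25
print(student)              # {'name': 'Madhu', 'age': '25', 'department': 'CSE'}
```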

  

JSON vs XML – Comparison

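| Feature | JSON | XML |
|---|---|---|
| Simplicity | Simple and compact | More verbose |
| Data Type | Supports strings, numbers, booleans, arrays, and objects | All values are text; types need a schema or convention |
| Readability | Easy for humans | Usually harder to read |
| Parsing | Usually faster | Usually slower |
| Use Case | APIs, web applications | Documents, configurations |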

 

  • JSON and XML are formats for data storage and exchange.
  • JSON is lightweight and widely used in web APIs.
  • XML is more structured and descriptive, useful for hierarchical data.
  • Python provides built-in modules — json and xml.etree.ElementTree — to easily work with both.

 

NumPy with Python

 

Introduction to NumPy

  • NumPy stands for Numerical Python.
  • It is a powerful library used for numerical and scientific computing.
  • It provides support for multidimensional arrays, mathematical operations, and linear algebra.
  • Widely used in Data Science, Machine Learning, and Scientific Applications.

 

Why Use NumPy?

Python lists are slow and memory-hungry for large numerical workloads, because every element is stored as a separate Python object.
NumPy arrays are:

  • Faster and more memory-efficient
  • Allow vectorized operations (no need for loops)
  • Integrated with many scientific and ML libraries (Pandas, Scikit-learn, TensorFlow)
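The difference between a loop and a vectorized operation can be sketched as follows. The price values and the factor 1.5 are made-up sample data.

```python
import numpy as np

prices = [100.0, 250.0, 80.0]

# Plain Python: an explicit loop (here a list comprehension) over every element
scaled_list = [p * 1.5 for p in prices]

# NumPy: one vectorized expression applied to the whole array
scaled_arr = np.array(prices) * 1.5

print(scaled_list)          # [150.0, 375.0, 120.0]
print(scaled_arr.tolist())  # [150.0, 375.0, 120.0]
```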

 

Installing NumPy

Before using NumPy, install it using:

pip install numpy

Then import it in Python:

import numpy as np

 

NumPy Arrays

The core of NumPy is the ndarray (N-dimensional array) object.

Creating Arrays

import numpy as np

 

# From list

arr = np.array([1, 2, 3, 4, 5])

print(arr)

 

# Multi-dimensional array

matrix = np.array([[1, 2, 3], [4, 5, 6]])

print(matrix)

 

Array Attributes

 

| Attribute | Description | Example |
|---|---|---|
| ndim | Number of dimensions | arr.ndim |
| shape | Tuple with the length of each dimension (rows, columns for a 2D array) | arr.shape |
| size | Total number of elements | arr.size |
| dtype | Data type of elements | arr.dtype |

 

Example:

arr = np.array([[1, 2, 3], [4, 5, 6]])

print(arr.ndim)   # 2

print(arr.shape)  # (2, 3)

print(arr.size)   # 6

 

Creating Arrays with Built-in Functions

 

| Function | Description | Example |
|---|---|---|
| np.zeros() | Creates array of zeros | np.zeros((2,3)) |
| np.ones() | Creates array of ones | np.ones((2,3)) |
| np.arange() | Creates array with range of values | np.arange(0,10,2) |
| np.linspace() | Creates evenly spaced values | np.linspace(0,1,5) |
| np.eye() | Identity matrix | np.eye(3) |
| np.random.rand() | Random values between 0 and 1 | np.random.rand(2,3) |
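A quick sketch that runs each function from the table and inspects the result. The variable names are only for illustration.

```python
import numpy as np

zeros = np.zeros((2, 3))             # 2 x 3 array filled with 0.0
ones = np.ones((2, 3))               # 2 x 3 array filled with 1.0
evens = np.arange(0, 10, 2)          # start 0, stop before 10, step 2
steps = np.linspace(0, 1, 5)         # 5 evenly spaced values from 0 to 1 inclusive
identity = np.eye(3)                 # 3 x 3 identity matrix
random_values = np.random.rand(2, 3) # 2 x 3 random values in [0, 1)

print(zeros.shape, ones.shape)  # (2, 3) (2, 3)
print(evens.tolist())           # [0, 2, 4, 6, 8]
print(steps.tolist())           # [0.0, 0.25, 0.5, 0.75, 1.0]
print(identity.tolist())        # [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
```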

 

Array Indexing and Slicing

You can access and modify array elements easily.

arr = np.array([10, 20, 30, 40, 50])

print(arr[0])     # First element

print(arr[1:4])   # Slicing elements

arr[2] = 100      # Modify element

print(arr)

For 2D arrays:

matrix = np.array([[1,2,3],[4,5,6],[7,8,9]])

print(matrix[1,2])   # Element at 2nd row, 3rd column

print(matrix[:,1])   # All rows, 2nd column

 

Array Operations

NumPy supports element-wise arithmetic operations.

a = np.array([1,2,3])

b = np.array([4,5,6])

 

print(a + b)  # [5 7 9]

print(a - b)  # [-3 -3 -3]

print(a * b)  # [ 4 10 18]

print(a / b)  # [0.25 0.4  0.5 ]

 

Also supports:

  • np.sum(a) – Sum of elements
  • np.mean(a) – Mean value
  • np.max(a) / np.min(a) – Max/Min element
  • np.sqrt(a) – Square root
  • np.dot(a, b) – Dot product
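The functions listed above can be tried on the same a and b arrays. This is a small sketch; the array used for np.sqrt() is an extra example chosen so that the roots are whole numbers.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print(np.sum(a))             # 6
print(np.mean(a))            # 2.0
print(np.max(a), np.min(a))  # 3 1
print(np.dot(a, b))          # 32  (1*4 + 2*5 + 3*6)

roots = np.sqrt(np.array([1, 4, 9]))
print(roots.tolist())        # [1.0, 2.0, 3.0]
```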

 

 Array Reshaping

arr = np.arange(6)

print(arr.reshape(2,3))  # Reshape 1D → 2D

 

Combining and Splitting Arrays

a = np.array([[1,2],[3,4]])

b = np.array([[5,6]])

 

# Vertical stacking

print(np.vstack((a,b)))

 

# Horizontal stacking

print(np.hstack((a,b.T)))

 

Broadcasting

Allows arithmetic between arrays of different shapes.

a = np.array([[1,2,3],[4,5,6]])

b = np.array([10,20,30])

print(a + b)
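To make the result of broadcasting visible, the sketch below prints the sums. The column array is an additional, made-up example showing that a (2, 1) array is stretched across the columns in the same way.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)
b = np.array([10, 20, 30])             # shape (3,): added to every row
column = np.array([[100], [200]])      # shape (2, 1): added to every column

row_sum = a + b
col_sum = a + column

print(row_sum.tolist())  # [[11, 22, 33], [14, 25, 36]]
print(col_sum.tolist())  # [[101, 102, 103], [204, 205, 206]]
```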

 

 

Mathematical and Statistical Functions

 

| Function | Description |
|---|---|
| np.mean(a) | Average of elements |
| np.median(a) | Median value |
| np.std(a) | Standard deviation |
| np.var(a) | Variance |
| np.sum(a) | Sum of elements |
| np.sqrt(a) | Square root |
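A worked example of the statistical functions, using a small made-up data set. By default np.std() and np.var() compute the population values (dividing by N); passing ddof=1 gives the sample versions.

```python
import numpy as np

values = np.array([2, 4, 4, 4, 5, 5, 7, 9])

mean = np.mean(values)      # 40 / 8 = 5.0
median = np.median(values)  # middle of the sorted values: (4 + 5) / 2 = 4.5
variance = np.var(values)   # mean squared deviation: 32 / 8 = 4.0
std_dev = np.std(values)    # square root of the variance: 2.0
sample_var = np.var(values, ddof=1)  # sample variance: 32 / 7

print(mean, median, variance, std_dev)
```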

 

Example Program

import numpy as np

 

data = np.array([[2, 4, 6], [1, 3, 5]])

print("Original Array:\n", data)

print("Mean:", np.mean(data))

print("Max:", np.max(data))

print("Sum of each column:", np.sum(data, axis=0))

 

Applications of NumPy

  • Data Science – data manipulation and preprocessing
  • Machine Learning – matrix operations
  • Image Processing – pixel data manipulation
  • Scientific Computing – solving mathematical equations
  • Statistics & Probability – analyzing datasets

 

Summary

  • NumPy provides high-performance multi-dimensional arrays.
  • It offers a faster, more memory-efficient alternative to Python lists for numerical computations.
  • Essential for Data Science, Machine Learning, and AI.

 

Pandas in Python

 

Introduction

  • Pandas is a powerful and popular Python library for data manipulation and analysis.
  • It provides high-performance data structures and data analysis tools.
  • The name “Pandas” comes from “Panel Data”, a term used in statistics.

 

Why Pandas?

Pandas makes it easy to:

  • Handle and analyze tabular data (like Excel or CSV files).
  • Perform data cleaning, filtering, grouping, and aggregation.
  • Integrate seamlessly with NumPy, Matplotlib, and Scikit-learn.
  • Work with large datasets efficiently.

 

Installing Pandas

pip install pandas

Import it in Python:

import pandas as pd

 

Data Structures in Pandas

Pandas provides two main data structures:

| Data Structure | Description | Example |
|---|---|---|
| Series | 1D labeled array (like a column in Excel) | pd.Series() |
| DataFrame | 2D labeled data (like a spreadsheet) | pd.DataFrame() |

 

Pandas Series

A Series is like a one-dimensional array with labels (index).

import pandas as pd

data = pd.Series([10, 20, 30, 40])

print(data)

 

Output:

0    10

1    20

2    30

3    40

dtype: int64

Custom index:

data = pd.Series([100, 200, 300], index=['a', 'b', 'c'])

print(data['b'])  # Output: 200

 

Pandas DataFrame

A DataFrame is a two-dimensional table of data with rows and columns.

 

import pandas as pd

 

data = {

    'Name': ['Madhu', 'Latha', 'Ravi'],

    'Age': [22, 21, 23],

    'Dept': ['CSE', 'ECE', 'IT']

}

 

df = pd.DataFrame(data)

print(df)

 

Output:

    Name  Age Dept

0  Madhu   22  CSE

1  Latha   21  ECE

2   Ravi   23   IT

 

Reading and Writing Data

Pandas can read and write data from different file formats.

 

| File Type | Function to Read | Function to Write |
|---|---|---|
| CSV | pd.read_csv() | to_csv() |
| Excel | pd.read_excel() | to_excel() |
| JSON | pd.read_json() | to_json() |
| SQL | pd.read_sql() | to_sql() |

Example:

df = pd.read_csv('students.csv')

df.to_excel('students.xlsx', index=False)
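The example above needs a students.csv file on disk. As a self-contained sketch, the same read and write calls also accept an in-memory text buffer (io.StringIO); the CSV content below is made-up sample data.

```python
import io
import pandas as pd

csv_text = "Name,Age,Dept\nMadhu,22,CSE\nLatha,21,ECE\nRavi,23,IT\n"

# read_csv accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

# to_csv can write to a buffer as well; index=False leaves out the row labels
buffer = io.StringIO()
df.to_csv(buffer, index=False)

print(df.shape)                          # (3, 3)
print(buffer.getvalue().splitlines()[0]) # Name,Age,Dept
```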

 

DataFrame Operations

 

a) Viewing Data

df.head()        # First 5 rows

df.tail(3)       # Last 3 rows

df.info()        # Summary of DataFrame

df.describe()    # Statistical summary

df.shape         # (rows, columns)

 

b) Selecting Data

df['Name']       # Select single column

df[['Name','Age']]  # Multiple columns

df.iloc[0]       # Select by row index

df.loc[1, 'Name']  # Select specific cell
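The difference between iloc (position-based) and loc (label-based) is easier to see when the index is not 0, 1, 2. The sketch below assumes hypothetical row labels s1, s2, s3.

```python
import pandas as pd

df = pd.DataFrame(
    {"Name": ["Madhu", "Latha", "Ravi"], "Age": [22, 21, 23]},
    index=["s1", "s2", "s3"],   # hypothetical row labels
)

by_position = df.iloc[0]        # first row, selected by integer position
by_label = df.loc["s1"]         # same row, selected by its label
cell = df.loc["s2", "Name"]     # single cell: row label, column name

first_two_pos = df.iloc[0:2]       # position slice: end is excluded
first_two_lab = df.loc["s1":"s2"]  # label slice: end is included

print(cell)                                    # Latha
print(len(first_two_pos), len(first_two_lab))  # 2 2
```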

 

Filtering and Conditional Selection

df[df['Age'] > 21]

df[(df['Dept'] == 'CSE') & (df['Age'] > 21)]

 

Adding and Removing Columns

df['Marks'] = [85, 90, 88]      # Add new column

df.drop('Dept', axis=1)  # Remove column (returns a new DataFrame; df keeps 'Dept', which the grouping example needs)

 

Handling Missing Data

df.isnull()         # Check for missing values

df.dropna()         # Drop rows with null values

df.fillna(0)        # Replace nulls with 0
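A small sketch of these three calls on a DataFrame that actually contains a missing value. The data is made up; note that dropna() and fillna() return new DataFrames and leave the original unchanged.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Name": ["Madhu", "Latha", "Ravi"],
    "Marks": [85.0, np.nan, 88.0],   # one missing value
})

missing_per_column = df.isnull().sum()   # count of missing values per column
dropped = df.dropna()                    # rows with any missing value removed
filled = df.fillna({"Marks": df["Marks"].mean()})  # replace with the column mean

print(missing_per_column["Marks"])  # 1
print(len(dropped))                 # 2
print(filled["Marks"].tolist())     # [85.0, 86.5, 88.0]
```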

 

Sorting and Grouping Data

df.sort_values(by='Age', ascending=False)

df.groupby('Dept')['Marks'].mean()
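Sorting and grouping on a complete, made-up DataFrame (the fourth student, Kiran, is a hypothetical addition so that one department has two rows):

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Madhu", "Latha", "Ravi", "Kiran"],
    "Dept": ["CSE", "ECE", "CSE", "IT"],
    "Marks": [85, 90, 75, 88],
})

# Highest marks first
ranked = df.sort_values(by="Marks", ascending=False)

# Average marks per department
dept_mean = df.groupby("Dept")["Marks"].mean()

print(ranked["Name"].tolist())  # ['Latha', 'Kiran', 'Madhu', 'Ravi']
print(dept_mean.to_dict())      # {'CSE': 80.0, 'ECE': 90.0, 'IT': 88.0}
```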

 

Merging, Joining, and Concatenation

a) Merging

pd.merge(df1, df2, on='ID')

 

b) Concatenation

pd.concat([df1, df2])
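The snippets above assume that df1 and df2 already exist. A self-contained sketch with made-up frames shows what merge and concat return; the how="left" variant and ignore_index=True are optional extras.

```python
import pandas as pd

df1 = pd.DataFrame({"ID": [1, 2, 3], "Name": ["Madhu", "Latha", "Ravi"]})
df2 = pd.DataFrame({"ID": [2, 3, 4], "Marks": [90, 88, 75]})

# Merge: match rows on the shared ID column (inner join by default)
inner = pd.merge(df1, df2, on="ID")

# Left join: keep every row of df1, with NaN where df2 has no match
left = pd.merge(df1, df2, on="ID", how="left")

# Concatenation: stack frames with the same columns on top of each other
more = pd.DataFrame({"ID": [4], "Name": ["Kiran"]})
stacked = pd.concat([df1, more], ignore_index=True)

print(inner["ID"].tolist())          # [2, 3]
print(left["Marks"].isnull().sum())  # 1
print(len(stacked))                  # 4
```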

 

Statistical and Mathematical Operations

 

df['Age'].mean()

df['Marks'].max()

df['Marks'].sum()

df.corr(numeric_only=True)   # Correlation matrix (numeric columns only; text columns would raise an error in recent pandas)

 

Example Program

import pandas as pd

 

data = {

    'Student': ['A', 'B', 'C', 'D'],

    'Marks': [85, 90, 78, 92],

    'Department': ['CSE', 'IT', 'CSE', 'ECE']

}

 

df = pd.DataFrame(data)

print("Data:\n", df)

print("\nAverage Marks:", df['Marks'].mean())

print("\nCSE Students:\n", df[df['Department'] == 'CSE'])

Output:

Data:

  Student  Marks Department

0       A     85        CSE

1       B     90         IT

2       C     78        CSE

3       D     92        ECE

 

Average Marks: 86.25

CSE Students:

  Student  Marks Department

0       A     85        CSE

2       C     78        CSE

 

Applications of Pandas

  • Data Cleaning and Preparation
  • Statistical Analysis
  • Data Visualization (with Matplotlib/Seaborn)
  • Machine Learning Preprocessing
  • Financial Data Analysis

 

Important points

  • Pandas is a core library for Data Science in Python.
  • Provides easy handling of structured data.
  • Supports file I/O, filtering, grouping, and analytics.
  • Works well with NumPy and Matplotlib.

 

In short:
NumPy = Numerical computations
Pandas = Data handling and analysis
Matplotlib/Seaborn = Data visualization

 

Plotting Graphs Using Pandas in Python

 

Introduction

  • Pandas provides built-in data visualization features using the Matplotlib library.
  • It allows us to create different types of plots and charts directly from Series or DataFrame objects.
  • Helps in understanding patterns, trends, and relationships in data visually.

 

Importing Required Libraries

Before plotting, import the necessary libraries:

import pandas as pd

import matplotlib.pyplot as plt

 

Note: If Matplotlib is not installed, install it using: pip install matplotlib

 

Creating a Simple DataFrame

Let’s create some data first:

import pandas as pd

 

data = {

    'Year': [2020, 2021, 2022, 2023, 2024],

    'Sales': [200, 250, 300, 350, 400],

    'Profit': [20, 25, 30, 28, 35]

}

 

df = pd.DataFrame(data)

print(df)

Output:

   Year  Sales  Profit

0  2020    200      20

1  2021    250      25

2  2022    300      30

3  2023    350      28

4  2024    400      35

 

Line Plot

A line plot is used to display data changes over a period of time.

df.plot(x='Year', y='Sales', kind='line', title='Yearly Sales', color='blue', marker='o')

plt.xlabel('Year')

plt.ylabel('Sales')

plt.grid(True)

plt.show()

 

Explanation:

  • kind='line' → line chart
  • x and y define which columns to use
  • marker='o' → shows data points on the line

 

Bar Plot

Used to compare categories or quantities.

df.plot(x='Year', y='Profit', kind='bar', title='Yearly Profit', color='orange')

plt.xlabel('Year')

plt.ylabel('Profit')

plt.show()

 

 

Explanation:

  • Each bar represents a category (here, year).
  • Useful for comparing profits or counts.

 

Multiple Line Plot

 

To compare two columns in one graph:

df.plot(x='Year', y=['Sales', 'Profit'], kind='line', marker='o')

plt.title('Sales vs Profit over Years')

plt.xlabel('Year')

plt.ylabel('Values')

plt.show()

 

 Explanation:

  • Plots both columns on the same graph.
  • Helps to see the relationship between sales and profit.

 

 Histogram

Used to display frequency distribution of numerical data.

df['Sales'].plot(kind='hist', bins=5, color='green', title='Sales Distribution')

plt.xlabel('Sales')

plt.show()

 

Explanation:

  • bins → number of intervals.
  • Useful for analyzing data spread or patterns.

 

Pie Chart

Used to show percentage or proportion of categories.

df['Profit'].plot(kind='pie', labels=df['Year'], autopct='%1.1f%%', startangle=90)

plt.title('Profit Share by Year')

plt.ylabel('')

plt.show()

 

Explanation:

  • autopct → shows percentage values.
  • startangle=90 → starts chart from the top.

 

Scatter Plot

Used to show relationship between two numeric variables.

df.plot(kind='scatter', x='Sales', y='Profit', color='red', title='Sales vs Profit')

plt.xlabel('Sales')

plt.ylabel('Profit')

plt.show()

 

Explanation:

  • Each point represents a (Sales, Profit) pair.
  • Helps identify trends or correlations.

 

Box Plot

Used for statistical analysis (to check data spread and outliers).

df[['Sales', 'Profit']].plot(kind='box', title='Sales and Profit Distribution')

plt.show()

 

Explanation:

  • Shows median, quartiles, and outliers.
  • Useful for understanding data variability.

 

 

Customizing the Graphs

You can enhance the appearance using Matplotlib options:

# df.plot() opens its own figure, so the size is passed through figsize

df.plot(x='Year', y='Sales', kind='line', color='purple', marker='o', linestyle='--', figsize=(8, 5))

plt.title('Customized Sales Graph')

plt.xlabel('Year')

plt.ylabel('Sales')

plt.grid(True)

plt.show()
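Another way to control the figure is to create it first with plt.subplots() and hand the Axes to pandas through the ax argument. This is a sketch under two assumptions: the non-interactive "Agg" backend is selected so that it runs without a display, and the figure is closed instead of shown.

```python
import matplotlib
matplotlib.use("Agg")   # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "Year": [2020, 2021, 2022, 2023, 2024],
    "Sales": [200, 250, 300, 350, 400],
})

# Create the figure and axes explicitly, then let pandas draw on them
fig, ax = plt.subplots(figsize=(8, 5))
df.plot(x="Year", y="Sales", kind="line", marker="o", linestyle="--", ax=ax)

# Customize through the Axes object
ax.set_title("Customized Sales Graph")
ax.set_xlabel("Year")
ax.set_ylabel("Sales")
ax.grid(True)

width, height = fig.get_size_inches()
plt.close(fig)   # in an interactive session, call plt.show() instead
```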

 

Example: Comparing Multiple Graphs

import matplotlib.pyplot as plt

 

plt.figure(figsize=(10,6))

 

# Line plot

plt.subplot(2,1,1)

plt.plot(df['Year'], df['Sales'], marker='o', color='blue', label='Sales')

plt.plot(df['Year'], df['Profit'], marker='s', color='red', label='Profit')

plt.title('Sales and Profit Comparison')

plt.legend()

 

# Bar plot

plt.subplot(2,1,2)

plt.bar(df['Year'], df['Sales'], color='green')

plt.title('Sales Growth')

 

plt.tight_layout()

plt.show()

 

Types of Plots Supported by Pandas

 

| Plot Type | Parameter | Description |
|---|---|---|
| Line Plot | 'line' | Default plot type |
| Bar Plot | 'bar' | Vertical bars |
| Horizontal Bar Plot | 'barh' | Horizontal bars |
| Histogram | 'hist' | Data distribution |
| Box Plot | 'box' | Statistical view |
| Area Plot | 'area' | Filled area under line |
| Pie Chart | 'pie' | Category proportions |
| Scatter Plot | 'scatter' | Relation between two variables |

 

  • Pandas integrates with Matplotlib to make plotting simple and powerful.
  • Useful for data visualization, trend analysis, and decision-making.
  • Common plots include line, bar, scatter, pie, histogram, and box plots.
  • Helps engineers and analysts visualize complex data clearly and effectively.

 

 

**************************************





WEEK 5

List of Experiments

1. Python program to check whether a JSON string contains a complex object or not.

2. Python program to demonstrate NumPy array creation using the array() function.

3. Python program to demonstrate the use of ndim, shape, size, dtype.

4. Python program to demonstrate basic slicing, integer and Boolean indexing.

5. Python program to find the min, max, sum, and cumulative sum of an array.

6. Create a dictionary with at least five keys, where each key maps to a list of at least ten values. Convert this dictionary to a pandas DataFrame and explore the data through the DataFrame as follows:

a) Apply the head() function to the pandas DataFrame

b) Perform various data selection operations on the DataFrame

7. Select any two columns from the above data frame, and observe the change in one attribute with respect

 

 

 

Program 1: Python program to check whether a JSON string contains a complex object or not.

 

Method I

CODE:

import json

 

# Sample JSON strings

json_string1 = '{"name": "Maddy", "age": 22, "marks": {"math": 90, "science": 85}}'

json_string2 = '{"name": "Rahul", "age": 20, "city": "Delhi"}'

 

def has_complex_object(json_str):

    try:

        # Convert JSON string to Python object (dictionary)

        data = json.loads(json_str)

       

        # Check for any complex (nested) structure like dict or list

        for value in data.values():

            if isinstance(value, (dict, list)):

                return True

        return False

 

    except json.JSONDecodeError:

        print("Invalid JSON format!")

        return None

 

# Test the function

print("JSON 1:", has_complex_object(json_string1))  # True → contains nested dict

print("JSON 2:", has_complex_object(json_string2))  # False → all values are simple

 

Output:

JSON 1: True

JSON 2: False

 

Explanation:

 

  • import json → to work with JSON data in Python.
  • json.loads() → converts a JSON string into a Python dictionary.
  • The program checks each value in the dictionary: if any value is a list or another dictionary, it is a complex object.
  • Returns True if a complex object is found, and False if all values are simple (string, number, etc.).

 

 

 

Method II

JSON types: string, number, object (dict), array (list), true/false, null.

Complex numbers (e.g., 3+5j) are not directly supported, so if a JSON string contains a bare complex number, the standard json module raises an error.

We can still detect whether a JSON string contains a complex object by:

  • Trying to parse it with json.loads().
  • If parsing fails, checking whether the data contains "j" (the imaginary unit).
  • Or, after parsing, scanning the values to see if any are complex-like.

 

Program: Detect Complex Object in JSON String

CODE:

 

import json

def contains_complex(json_str):
    try:
        data = json.loads(json_str)   # Try parsing JSON
    except json.JSONDecodeError:
        return False                  # Invalid JSON cannot be checked

    # Recursively check if any value is a complex-number string
    def check_complex(obj):
        if isinstance(obj, dict):
            return any(check_complex(v) for v in obj.values())
        if isinstance(obj, list):
            return any(check_complex(i) for i in obj)
        if isinstance(obj, str):
            # complex() also accepts plain numbers such as "10",
            # so the imaginary unit "j" is required as well
            try:
                complex(obj)          # Attempt conversion (e.g., "3+4j")
            except ValueError:
                return False
            return "j" in obj.lower()
        return False

    return check_complex(data)

# Example JSON strings
json1 = '{"name": "Alice", "age": 25, "number": "3+4j"}'
json2 = '{"x": 10, "y": 20}'

print("JSON 1 contains complex:", contains_complex(json1))  # True
print("JSON 2 contains complex:", contains_complex(json2))  # False

Output:

JSON 1 contains complex: True
JSON 2 contains complex: False

Explanation

  • json.loads(json_str) → parses JSON into a Python dict/list. Example: '{"a": 1}' → {"a": 1} (Python dict).
  • Recursive function check_complex():
  • If the object is a dict → check all values.
  • If the object is a list → check all items.
  • If the object is a string → try converting it with complex().
  • If the conversion succeeds and the string contains "j" → it is a complex-like value. Without the "j" test, plain numeric strings such as "10" would also be reported as complex.
  • Returns True if any string looks like "a+bj", otherwise False.

 

 Note:

If you really want to store complex numbers in JSON, you need custom encoding (e.g., save as {"real": 3, "imag": 4}).
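One possible way to do such custom encoding is sketched below. The helper names encode_complex and decode_complex are hypothetical; they rely on the default argument of json.dumps() and the object_hook argument of json.loads(), and they assume that any dict with exactly the keys "real" and "imag" represents a complex number.

```python
import json

def encode_complex(value):
    """Hypothetical helper: store a complex number as its real and imaginary parts."""
    if isinstance(value, complex):
        return {"real": value.real, "imag": value.imag}
    raise TypeError(f"Cannot serialize {type(value).__name__}")

def decode_complex(obj):
    """Hypothetical helper: rebuild a complex number from a real/imag dict."""
    if set(obj) == {"real", "imag"}:
        return complex(obj["real"], obj["imag"])
    return obj

text = json.dumps({"z": 3 + 4j}, default=encode_complex)
restored = json.loads(text, object_hook=decode_complex)

print(text)      # {"z": {"real": 3.0, "imag": 4.0}}
print(restored)  # {'z': (3+4j)}
```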

 

 

Program 2: Python program to demonstrate NumPy array creation using the array() function

 

NumPy array() Function

 

What is numpy.array()?

The array() function in NumPy is used to create an ndarray (N-dimensional array) from:

  • Python lists
  • Python tuples
  • Nested sequences (list of lists → matrix)

 

Syntax:

numpy.array(object, dtype=None, copy=True, order='K', subok=False, ndmin=0)

 

Parameters:

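object → input data (list, tuple, nested list, etc.)

dtype → specify data type (int32, float64, etc.)

copy → if True (default), the data is always copied; if False, a copy is avoided where possible (NumPy 2.x raises ValueError when a copy cannot be avoided, as with a Python list)

order → memory layout:

'K' = keep the layout of an input array (default; lists and tuples are stored in C order)

'C' = row-major (C-style)

'F' = column-major (Fortran-style)

subok → if True, subclasses are passed through

ndmin → minimum number of dimensions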

 

 Method I

 CODE:

# Import the NumPy library

import numpy as np

 

# 1 Create a 1-D array (one-dimensional)

arr1 = np.array([10, 20, 30, 40, 50])

print("1-D Array:")

print(arr1)

 

# 2 Create a 2-D array (two-dimensional)

arr2 = np.array([[1, 2, 3], [4, 5, 6]])

print("\n2-D Array:")

print(arr2)

 

# 3 Create a 3-D array (three-dimensional)

arr3 = np.array([

    [[1, 2], [3, 4]],

    [[5, 6], [7, 8]]

])

print("\n3-D Array:")

print(arr3)

 

# 4 Check type and dimension of arrays

print("\nType of arr1:", type(arr1))

print("Dimension of arr1:", arr1.ndim)

print("Dimension of arr2:", arr2.ndim)

print("Dimension of arr3:", arr3.ndim)

 

 Output

1-D Array:

[10 20 30 40 50]

 

2-D Array:

[[1 2 3]

 [4 5 6]]

 

3-D Array:

[[[1 2]

  [3 4]]

 

 [[5 6]

  [7 8]]]

 

Type of arr1: <class 'numpy.ndarray'>

Dimension of arr1: 1

Dimension of arr2: 2

Dimension of arr3: 3

 

 Method II

Python Program: Demonstrating numpy.array()

 

CODE:

import numpy as np

# 1. Creating 1D array from list
arr1 = np.array([1, 2, 3, 4, 5])
print("1D Array:", arr1)

# 2. Creating 2D array (Matrix) from nested list
arr2 = np.array([[1, 2, 3], [4, 5, 6]])
print("\n2D Array:\n", arr2)

# 3. Creating array from tuple
arr3 = np.array((10, 20, 30))
print("\nArray from Tuple:", arr3)

# 4. Specifying dtype
arr4 = np.array([1, 2, 3], dtype=float)
print("\nArray with dtype float:", arr4)

# 5. Using ndmin (minimum dimensions)
arr5 = np.array([1, 2, 3, 4], ndmin=3)
print("\nArray with ndmin=3:\n", arr5)
print("Shape of arr5:", arr5.shape)

# 6. Copy behaviour
# A Python list cannot share memory with an ndarray, so a copy is always made.
# np.array(list_data, copy=False) raises ValueError on NumPy 2.x for that reason;
# np.asarray() copies only when it has to and works on all versions.
list_data = [1, 2, 3]
arr6 = np.asarray(list_data)
print("\nOriginal List:", list_data)
print("NumPy Array (np.asarray):", arr6)

# Modify list and check array
list_data[0] = 99
print("Modified List:", list_data)
print("NumPy Array after modifying list:", arr6)  # Unchanged: arr6 has its own data

OUTPUT:

1D Array: [1 2 3 4 5]

2D Array:
 [[1 2 3]
 [4 5 6]]

Array from Tuple: [10 20 30]

Array with dtype float: [1. 2. 3.]

Array with ndmin=3:
 [[[1 2 3 4]]]
Shape of arr5: (1, 1, 4)

Original List: [1, 2, 3]
NumPy Array (np.asarray): [1 2 3]
Modified List: [99, 2, 3]
NumPy Array after modifying list: [1 2 3]

 

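Notice:

arr6 did not change when the list was modified, because a Python list cannot share memory with an ndarray: NumPy always copies list data into its own compact buffer. (On NumPy 2.x, np.array(list_data, copy=False) raises ValueError for this reason; np.asarray(list_data) works on every version.)

If dtype=float is set, integers are automatically converted.

ndmin=3 creates an array with at least 3 dimensions (extra dimensions are added at the front of the shape).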
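The copy question becomes visible when the input is already an ndarray instead of a list. In this sketch, np.array() makes an independent copy (copy=True is the default), while np.asarray() returns the same array without copying.

```python
import numpy as np

source = np.array([1, 2, 3])

copied = np.array(source)     # independent copy of the data
shared = np.asarray(source)   # input is already an ndarray, so no copy is made

source[0] = 99                # change the original array

print(copied.tolist())  # [1, 2, 3]   (the copy is unaffected)
print(shared.tolist())  # [99, 2, 3]  (same data as source)
print(np.shares_memory(source, shared))  # True
```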


 Key Takeaways

np.array() converts Python lists/tuples into NumPy ndarrays.

Supports dtype conversion, multi-dimensional arrays, and custom memory layouts.

Very efficient compared to Python lists (uses less memory, faster).

 

 

Program 3: Python program to demonstrate use of ndim, shape, size, dtype.

 

ndim → Number of dimensions (axes) of the array.

 

shape → Tuple of array dimensions (rows, cols, etc.).

 

size → Total number of elements in the array.

 

dtype → Data type of array elements (int32, float64, etc.).

 

Method I

CODE:

# Import NumPy library

import numpy as np

 

# Create a 2D NumPy array

arr = np.array([[10, 20, 30], [40, 50, 60]])

 

# Display the array

print("Array:")

print(arr)

 

# 1 Number of dimensions

print("\nNumber of Dimensions (ndim):", arr.ndim)

 

# 2 Shape of the array (rows, columns)

print("Shape of Array (shape):", arr.shape)

 

# 3 Total number of elements in the array

print("Size of Array (size):", arr.size)

 

# 4 Data type of elements stored in array

print("Data Type of Elements (dtype):", arr.dtype)

 

OUTPUT:

Array:

[[10 20 30]

 [40 50 60]]

 

Number of Dimensions (ndim): 2

Shape of Array (shape): (2, 3)

Size of Array (size): 6

Data Type of Elements (dtype): int64

  

     Explanation

| Attribute | Meaning | Example Output |
|---|---|---|
| ndim | Number of dimensions of array | 2 (since it’s 2D) |
| shape | Tuple showing rows and columns | (2, 3) → 2 rows, 3 columns |
| size | Total number of elements | 6 |
| dtype | Data type of array elements | int64 or int32 (depends on your system) |

 

 

Method II

 

Python Program: Demonstrating ndim, shape, size, dtype

 CODE:

import numpy as np

 

# 1D Array

arr1 = np.array([10, 20, 30, 40])

print("Array 1:", arr1)

print("ndim:", arr1.ndim)    # number of dimensions

print("shape:", arr1.shape)  # (4,) → one axis with 4 elements

print("size:", arr1.size)    # total elements

print("dtype:", arr1.dtype)  # data type

 

print("-" * 50)

 

# 2D Array

arr2 = np.array([[1, 2, 3], [4, 5, 6]])

print("Array 2:\n", arr2)

print("ndim:", arr2.ndim)    # 2D (matrix)

print("shape:", arr2.shape)  # (2,3) → 2 rows, 3 columns

print("size:", arr2.size)    # 6 elements

print("dtype:", arr2.dtype)

 

print("-" * 50)

 

# 3D Array

arr3 = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

print("Array 3:\n", arr3)

print("ndim:", arr3.ndim)    # 3D array

print("shape:", arr3.shape)  # (2,2,2) → 2 blocks, 2 rows, 2 cols

print("size:", arr3.size)    # 8 elements

print("dtype:", arr3.dtype)

 

OUTPUT:

Array 1: [10 20 30 40]

ndim: 1

shape: (4,)

size: 4

dtype: int64

--------------------------------------------------

Array 2:

 [[1 2 3]

  [4 5 6]]

ndim: 2

shape: (2, 3)

size: 6

dtype: int64

--------------------------------------------------

Array 3:

 [[[1 2]

   [3 4]]

 

  [[5 6]

   [7 8]]]

ndim: 3

shape: (2, 2, 2)

size: 8

dtype: int64

 

Explanation

ndim tells us if it’s 1D, 2D, or 3D.

arr1 → 1D

arr2 → 2D (matrix)

arr3 → 3D (cube/block).

shape gives dimensions:

arr1 → (4,) (4 elements along a single axis; a 1D array has no separate rows and columns).

arr2 → (2,3) (2 rows × 3 columns).

arr3 → (2,2,2) (2 blocks × 2 rows × 2 columns).

size = total number of elements (product of shape).

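dtype = data type of the elements. NumPy infers it from the input values (typically int64 for integers and float64 for decimal numbers; the integer default can be int32 on some systems).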
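The relationships in this explanation can be checked directly. This is a small sketch that reuses arr3 from the program above; the array named mixed is an added illustration of how the dtype is inferred.

```python
import numpy as np

arr3 = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

# size equals the product of the entries in shape; ndim equals len(shape)
print(arr3.size == np.prod(arr3.shape))  # True
print(arr3.ndim == len(arr3.shape))      # True

# dtype is inferred from the values: a single float makes the whole array float64
mixed = np.array([1, 2, 3.5])
print(mixed.dtype)   # float64
```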

 

 

Program 4: Python program to demonstrate basic slicing, integer and Boolean indexing.

 Method  I

CODE:

# Import NumPy library

import numpy as np

 

# Create a 1D NumPy array

arr = np.array([10, 20, 30, 40, 50, 60, 70])

 

print("Original Array:")

print(arr)

 

# 1 Basic Slicing

print("\n1. Basic Slicing Examples:")

print("Elements from index 1 to 4:", arr[1:5])   # 20 to 50

print("Elements from start to 3:", arr[:4])      # 10 to 40

print("Elements from index 3 to end:", arr[3:])  # 40 to 70

print("Every second element:", arr[::2])         # 10, 30, 50, 70

 

# 2 Integer Indexing

print("\n2. Integer Indexing Examples:")

indices = [0, 2, 5]

print("Elements at positions 0, 2, 5:", arr[indices])  # 10, 30, 60

 

# 3 Boolean Indexing

print("\n3. Boolean Indexing Examples:")

bool_mask = arr > 40

print("Boolean Mask (arr > 40):", bool_mask)

print("Elements greater than 40:", arr[bool_mask])

 

OUTPUT:

Original Array:

[10 20 30 40 50 60 70]

 

1. Basic Slicing Examples:

Elements from index 1 to 4: [20 30 40 50]

Elements from start to 3: [10 20 30 40]

Elements from index 3 to end: [40 50 60 70]

Every second element: [10 30 50 70]

 

2. Integer Indexing Examples:

Elements at positions 0, 2, 5: [10 30 60]

 

3. Boolean Indexing Examples:

Boolean Mask (arr > 40): [False False False False  True  True  True]

Elements greater than 40: [50 60 70]

 

 Explanation

Concept | Description | Example
Basic Slicing | Selects a range of elements using start:end:step | arr[1:5] → 20 30 40 50
Integer Indexing | Selects elements at specific positions | arr[[0, 2, 5]] → 10 30 60
Boolean Indexing | Uses True/False array to filter elements | arr[arr > 40] → 50 60 70
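One practical difference between the three techniques is that basic slicing returns a view of the original data, while integer and Boolean indexing return copies. The sketch below shows this with np.shares_memory; the variable names view, picked, and masked are illustrative.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50, 60, 70])

view = arr[1:5]          # basic slicing -> view on the same memory
picked = arr[[0, 2, 5]]  # integer indexing -> independent copy
masked = arr[arr > 40]   # Boolean indexing -> independent copy

print(np.shares_memory(arr, view))    # True
print(np.shares_memory(arr, picked))  # False
print(np.shares_memory(arr, masked))  # False

view[0] = 99     # writes through to arr[1]
picked[0] = -1   # changes only the copy
print(arr)       # [10 99 30 40 50 60 70]
```

If an independent slice is needed, calling .copy() on the slice avoids accidentally modifying the original array.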

 

 

 Method  II

 CODE:

import numpy as np

 

# Create a NumPy array

arr = np.array([10, 20, 30, 40, 50, 60, 70])

 

print("Original Array:")

print(arr)

 

# ---------------------------

# 1.Basic Slicing

# ---------------------------

# Get elements from index 2 to 5 (5 excluded)

slice1 = arr[2:5]

print("\nBasic Slicing arr[2:5]:", slice1)

 

# Get every 2nd element

slice2 = arr[::2]

print("Basic Slicing arr[::2] (every 2nd element):", slice2)

 

# ---------------------------

# 2.Integer Indexing

# ---------------------------

# Access multiple elements using a list of indices

indices = [1, 3, 5]

int_indexed = arr[indices]

print("\nInteger Indexing arr[[1,3,5]]:", int_indexed)

 

# ---------------------------

# 3.Boolean Indexing

# ---------------------------

# Create a Boolean condition

bool_indexed = arr[arr > 30]  # all elements greater than 30

print("\nBoolean Indexing arr[arr > 30]:", bool_indexed)

 

 

OUTPUT:

Original Array:

[10 20 30 40 50 60 70]

 

Basic Slicing arr[2:5]: [30 40 50]

Basic Slicing arr[::2] (every 2nd element): [10 30 50 70]

 

Integer Indexing arr[[1,3,5]]: [20 40 60]

 

Boolean Indexing arr[arr > 30]: [40 50 60 70]

 

Explanation

1.      Basic Slicing

o    arr[start:end] → selects elements from start to end-1.

o    arr[start:end:step] → selects elements with a step size.

2.      Integer Indexing

o    You can pass a list of indices to access multiple elements at once.

o    Example: arr[[1,3,5]] selects 2nd, 4th, and 6th elements.

3.      Boolean Indexing

o    You can create a condition that returns a Boolean array, and use it to filter elements.

o    Example: arr[arr > 30] selects all elements greater than 30.
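The same three techniques also accept negative positions and combined conditions. This is a short sketch on the same array, not part of the original program; note that combined conditions need parentheses and the element-wise operators &, | and ~ rather than the keywords and, or, not.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50, 60, 70])

# Slicing with negative indices and a negative step
print(arr[-3:])      # [50 60 70]  (last three elements)
print(arr[::-1])     # [70 60 50 40 30 20 10]  (reversed)

# Integer indexing also accepts negative positions
print(arr[[-1, 0]])  # [70 10]

# Boolean indexing with combined conditions
print(arr[(arr > 20) & (arr < 60)])  # [30 40 50]
print(arr[(arr < 20) | (arr > 60)])  # [10 70]
print(arr[~(arr > 30)])              # [10 20 30]
```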

 

 

Program 5: Python program to find min, max, sum, cumulative sum of array

 

CODE:

import numpy as np

 

# Create a NumPy array

arr = np.array([10, 20, 30, 40, 50])

 

print("Original Array:")

print(arr)

 

# Minimum value

min_val = np.min(arr)

print("\nMinimum value:", min_val)

 

# Maximum value

max_val = np.max(arr)

print("Maximum value:", max_val)

 

# Sum of all elements

sum_val = np.sum(arr)

print("Sum of elements:", sum_val)

 

# Cumulative sum

cum_sum = np.cumsum(arr)

print("Cumulative sum:", cum_sum)

 

OUTPUT:

Original Array:

[10 20 30 40 50]

 

Minimum value: 10

Maximum value: 50

Sum of elements: 150

Cumulative sum: [ 10  30  60 100 150]

 

Explanation

np.min(arr) → Returns the smallest element in the array.

 

np.max(arr) → Returns the largest element in the array.

 

np.sum(arr) → Returns the sum of all elements.

 

np.cumsum(arr) → Returns the cumulative sum, i.e., running total of elements.
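On a 2D array the same functions take an axis argument: axis=0 works down each column and axis=1 works across each row. The sketch below uses a small example matrix named mat, which is an added illustration and not part of the program above.

```python
import numpy as np

mat = np.array([[10, 20, 30],
                [40, 50, 60]])

print(np.sum(mat))          # 210 (all elements)
print(np.sum(mat, axis=0))  # [50 70 90]  (column totals)
print(np.sum(mat, axis=1))  # [ 60 150]   (row totals)
print(np.min(mat, axis=0))  # [10 20 30]  (smallest value in each column)
print(np.max(mat, axis=1))  # [30 60]     (largest value in each row)

# Running total along each row
print(np.cumsum(mat, axis=1))
# [[ 10  30  60]
#  [ 40  90 150]]

# Without axis, cumsum flattens the array first
print(np.cumsum(mat))       # [ 10  30  60 100 150 210]
```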

          

Program 6: Create a dictionary with at least five keys, where the value of each key is a list containing at least ten values. Convert this dictionary into a pandas DataFrame and explore the data through the DataFrame as follows:

a)     Apply head() function to the pandas data frame

b)     Perform various data selection operations on Data Frame

 

(a)  Apply head() function to the pandas data frame

 

CODE:

 

# Import pandas library

import pandas as pd

 

# 1 Create a dictionary with 5 keys and 10 values each

student_data = {

    'Name': ['Asha', 'Ravi', 'Kiran', 'Maya', 'John', 'Lina', 'Raj', 'Sara', 'Tom', 'Anu'],

    'Age': [18, 19, 20, 18, 21, 22, 19, 20, 18, 21],

    'Marks_Math': [78, 85, 92, 67, 88, 90, 76, 82, 95, 80],

    'Marks_Science': [82, 79, 88, 91, 73, 85, 89, 77, 94, 80],

    'City': ['Delhi', 'Mumbai', 'Chennai', 'Kolkata', 'Delhi', 'Pune', 'Hyderabad', 'Bangalore', 'Kochi', 'Jaipur']

}

 

# 2 Convert dictionary to a pandas DataFrame

df = pd.DataFrame(student_data)

 

# 3 Display the complete DataFrame

print("Complete DataFrame:")

print(df)

 

# 4 Apply head() function to display first 5 rows

print("\nFirst 5 Rows using head():")

print(df.head())

 

OUTPUT:

Complete DataFrame:

    Name  Age  Marks_Math  Marks_Science       City

0   Asha   18          78             82      Delhi

1   Ravi   19          85             79     Mumbai

2  Kiran   20          92             88    Chennai

3   Maya   18          67             91    Kolkata

4   John   21          88             73      Delhi

5   Lina   22          90             85       Pune

6    Raj   19          76             89  Hyderabad

7   Sara   20          82             77  Bangalore

8    Tom   18          95             94      Kochi

9    Anu   21          80             80     Jaipur

 

First 5 Rows using head():

    Name  Age  Marks_Math  Marks_Science     City

0   Asha   18          78             82    Delhi

1   Ravi   19          85             79   Mumbai

2  Kiran   20          92             88  Chennai

3   Maya   18          67             91  Kolkata

4   John   21          88             73    Delhi
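head() returns the first 5 rows by default, and it also accepts the number of rows to return. The sketch below uses a trimmed, three-column copy of the student data above (an assumption made to keep the example short) and adds tail() and shape for comparison.

```python
import pandas as pd

# Trimmed copy of the student data: three of the five columns
df = pd.DataFrame({
    'Name': ['Asha', 'Ravi', 'Kiran', 'Maya', 'John', 'Lina', 'Raj', 'Sara', 'Tom', 'Anu'],
    'Age': [18, 19, 20, 18, 21, 22, 19, 20, 18, 21],
    'Marks_Math': [78, 85, 92, 67, 88, 90, 76, 82, 95, 80],
})

print(df.head(3))   # first 3 rows (Asha, Ravi, Kiran)
print(df.tail(2))   # last 2 rows (Tom, Anu)
print(df.shape)     # (10, 3) -> 10 rows, 3 columns
```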

 

(b)   Perform various data selection operations on Data Frame

 

CODE:

 

# Import pandas library

import pandas as pd

 

# 1 Create a dictionary with 5 keys and 10 values each

student_data = {

    'Name': ['Asha', 'Ravi', 'Kiran', 'Maya', 'John', 'Lina', 'Raj', 'Sara', 'Tom', 'Anu'],

    'Age': [18, 19, 20, 18, 21, 22, 19, 20, 18, 21],

    'Marks_Math': [78, 85, 92, 67, 88, 90, 76, 82, 95, 80],

    'Marks_Science': [82, 79, 88, 91, 73, 85, 89, 77, 94, 80],

    'City': ['Delhi', 'Mumbai', 'Chennai', 'Kolkata', 'Delhi', 'Pune', 'Hyderabad', 'Bangalore', 'Kochi', 'Jaipur']

}

 

# 2 Convert dictionary into a pandas DataFrame

df = pd.DataFrame(student_data)

 

# Display the complete DataFrame

print("Complete DataFrame:")

print(df)

 

# 3 Explore the data

print("\nFirst 5 rows using head():")

print(df.head())

 

# 4 Perform various Data Selection Operations

 

# a) Select a single column

print("\n(a) Selecting a single column (Marks_Math):")

print(df['Marks_Math'])

 

# b) Select multiple columns

print("\n(b) Selecting multiple columns (Name, City, Marks_Science):")

print(df[['Name', 'City', 'Marks_Science']])

 

# c) Select a specific row using loc (by label)

print("\n(c) Selecting a specific row using loc (row index 2):")

print(df.loc[2])

 

# d) Select a specific row using iloc (by position)

print("\n(d) Selecting a specific row using iloc (row position 4):")

print(df.iloc[4])

 

OUTPUT:

Complete DataFrame:

    Name  Age  Marks_Math  Marks_Science       City

0   Asha   18          78             82      Delhi

1   Ravi   19          85             79     Mumbai

2  Kiran   20          92             88    Chennai

3   Maya   18          67             91    Kolkata

4   John   21          88             73      Delhi

5   Lina   22          90             85       Pune

6    Raj   19          76             89  Hyderabad

7   Sara   20          82             77  Bangalore

8    Tom   18          95             94      Kochi

9    Anu   21          80             80     Jaipur

 

First 5 rows using head():

    Name  Age  Marks_Math  Marks_Science     City

0   Asha   18          78             82    Delhi

1   Ravi   19          85             79   Mumbai

2  Kiran   20          92             88  Chennai

3   Maya   18          67             91  Kolkata

4   John   21          88             73    Delhi

 

(a) Selecting a single column (Marks_Math):

0    78

1    85

2    92

3    67

4    88

5    90

6    76

7    82

8    95

9    80

Name: Marks_Math, dtype: int64

 

(b) Selecting multiple columns (Name, City, Marks_Science):

    Name       City  Marks_Science

0   Asha      Delhi             82

1   Ravi     Mumbai             79

2  Kiran    Chennai             88

3   Maya    Kolkata             91

4   John      Delhi             73

5   Lina       Pune             85

6    Raj  Hyderabad             89

7   Sara  Bangalore             77

8    Tom      Kochi             94

9    Anu     Jaipur             80

 

(c) Selecting a specific row using loc (row index 2):

Name               Kiran

Age                   20

Marks_Math            92

Marks_Science         88

City             Chennai

Name: 2, dtype: object

 

(d) Selecting a specific row using iloc (row position 4):

Name              John

Age                 21

Marks_Math          88

Marks_Science       73

City             Delhi

Name: 4, dtype: object
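Selection can also be based on conditions, or on rows and columns together. This is a minimal sketch using four columns of the same student data; the variable names high_math and delhi are illustrative and not part of the original program.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Asha', 'Ravi', 'Kiran', 'Maya', 'John', 'Lina', 'Raj', 'Sara', 'Tom', 'Anu'],
    'Marks_Math': [78, 85, 92, 67, 88, 90, 76, 82, 95, 80],
    'Marks_Science': [82, 79, 88, 91, 73, 85, 89, 77, 94, 80],
    'City': ['Delhi', 'Mumbai', 'Chennai', 'Kolkata', 'Delhi', 'Pune',
             'Hyderabad', 'Bangalore', 'Kochi', 'Jaipur'],
})

# Rows that satisfy a condition (Boolean selection)
high_math = df[df['Marks_Math'] > 85]
print(high_math['Name'].tolist())   # ['Kiran', 'John', 'Lina', 'Tom']

# Two conditions combined with & (each condition in parentheses)
delhi = df[(df['City'] == 'Delhi') & (df['Marks_Science'] >= 80)]
print(delhi['Name'].tolist())       # ['Asha']

# loc: label-based, the end label is included (rows 2, 3 and 4)
print(df.loc[2:4, ['Name', 'City']])

# iloc: position-based, the end position is excluded (rows 0, 1 and 2)
print(df.iloc[0:3, 0:2])

# A single cell by row label and column name
print(df.at[8, 'Marks_Math'])       # 95
```

The main point to remember is that loc slices include the end label while iloc slices exclude the end position, just like ordinary Python slicing.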

 

 

 Program 7: Select any two columns from the above data frame and observe the change in one attribute with respect to the other using the scatter and plot operations in matplotlib.

 

CODE:

 

# Import necessary libraries

import pandas as pd

import matplotlib.pyplot as plt

 

# 1 Create a dictionary with sample student data

student_data = {

    'Name': ['Asha', 'Ravi', 'Kiran', 'Maya', 'John', 'Lina', 'Raj', 'Sara', 'Tom', 'Anu'],

    'Age': [18, 19, 20, 18, 21, 22, 19, 20, 18, 21],

    'Marks_Math': [78, 85, 92, 67, 88, 90, 76, 82, 95, 80],

    'Marks_Science': [82, 79, 88, 91, 73, 85, 89, 77, 94, 80],

    'City': ['Delhi', 'Mumbai', 'Chennai', 'Kolkata', 'Delhi', 'Pune', 'Hyderabad', 'Bangalore', 'Kochi', 'Jaipur']

}

 

# 2 Convert dictionary into a pandas DataFrame

df = pd.DataFrame(student_data)

 

# Display the DataFrame

print("Student DataFrame:")

print(df)

 

# 3 Select two columns for visualization

x = df['Marks_Math']

y = df['Marks_Science']

 

# 4 Create a Scatter Plot

plt.scatter(x, y, color='blue', marker='o')

plt.title("Scatter Plot: Marks in Math vs Science")

plt.xlabel("Marks in Math")

plt.ylabel("Marks in Science")

plt.grid(True)

plt.show()

 

# 5 Create a Line Plot (Plot Operation)

# Note: plot() joins the points in DataFrame row order, so unsorted x values give a zig-zag line

plt.plot(x, y, color='green', linestyle='--', marker='o')

plt.title("Line Plot: Marks in Math vs Science")

plt.xlabel("Marks in Math")

plt.ylabel("Marks in Science")

plt.grid(True)

plt.show()

 

OUTPUT:

 

Student DataFrame:

    Name  Age  Marks_Math  Marks_Science       City

0   Asha   18          78             82      Delhi

1   Ravi   19          85             79     Mumbai

2  Kiran   20          92             88    Chennai

3   Maya   18          67             91    Kolkata

4   John   21          88             73      Delhi

5   Lina   22          90             85       Pune

6    Raj   19          76             89  Hyderabad

7   Sara   20          82             77  Bangalore

8    Tom   18          95             94      Kochi

9    Anu   21          80             80     Jaipur

 



 

 


 


 

Explanation:

Function | Purpose | Description
plt.scatter(x, y) | Creates a scatter plot | Shows how one variable changes with another
plt.plot(x, y) | Creates a line plot | Connects data points with lines
plt.xlabel(), plt.ylabel() | Label axes | Gives context to the chart
plt.title() | Adds a title | Describes what the plot represents
plt.show() | Displays the plot window | Shows the graph
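Because plt.plot() connects points in the order they are given, a line plot of two mark columns is easier to read when the rows are sorted by the x column first. The sketch below is one possible variation, not the original program: it sorts by Marks_Math, and it also prints the Pearson correlation as a single number summarising how the two attributes move together. The names df_sorted and r are illustrative.

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    'Marks_Math': [78, 85, 92, 67, 88, 90, 76, 82, 95, 80],
    'Marks_Science': [82, 79, 88, 91, 73, 85, 89, 77, 94, 80],
})

# Sort by the x column so the line moves left to right
df_sorted = df.sort_values('Marks_Math')

# Pearson correlation between the two columns (a value between -1 and 1)
r = df['Marks_Math'].corr(df['Marks_Science'])
print(f"Correlation between Math and Science marks: {r:.2f}")

# Line plot of the sorted data
plt.plot(df_sorted['Marks_Math'], df_sorted['Marks_Science'],
         color='green', linestyle='--', marker='o')
plt.title("Line Plot (sorted by Math marks)")
plt.xlabel("Marks in Math")
plt.ylabel("Marks in Science")
plt.grid(True)
plt.show()
```

A correlation close to 0 suggests that the scatter plot is the more honest view of this data, since a line implies a trend between neighbouring points that may not exist.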

 

***************************END***************************

 


PP UNIT IV & WEEK IV PROGRAMS

  UNIT-IV Files: Types of Files, Creating and Reading Text Data, File Methods to Read and Write Data, Reading and Writing Binary Files, Pi...