{"id":3683,"date":"2023-09-10T11:20:34","date_gmt":"2023-09-10T11:20:34","guid":{"rendered":"https:\/\/www.copahost.com\/blog\/?p=3683"},"modified":"2023-09-10T11:20:39","modified_gmt":"2023-09-10T11:20:39","slug":"pandas-python","status":"publish","type":"post","link":"https:\/\/www.copahost.com\/blog\/pandas-python\/","title":{"rendered":"Pandas python: How to analyze data with pandas in Python"},"content":{"rendered":"\n<p>Pandas is a&nbsp;<strong>Python library that lets you work with tabular data, such as the contents of Excel or CSV spreadsheets<\/strong>.&nbsp;It is one of the most popular and widely used Python libraries, and is especially useful for anyone working in data analysis.&nbsp;With pandas, it is possible&nbsp;<strong>to manage, manipulate and analyze data in an easy and fast way<\/strong>, making the data analysis process more efficient and effective.<\/p>\n\n\n\n<p><strong>In this article, you&#8217;ll learn what you need to know to get started with pandas, from loading data to creating compelling visualizations<\/strong>.&nbsp;Let&#8217;s start!<\/p>\n\n\n\n<div id=\"ez-toc-container\" class=\"ez-toc-v2_0_69_1 ez-toc-wrap-center counter-hierarchy ez-toc-counter ez-toc-grey ez-toc-container-direction\">\n<div class=\"ez-toc-title-container\">\n<p class=\"ez-toc-title\" style=\"cursor:inherit\">Table of Contents<\/p>\n<span class=\"ez-toc-title-toggle\"><\/span><\/div>\n<nav><ul class='ez-toc-list ez-toc-list-level-1 ' ><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-1\" href=\"https:\/\/www.copahost.com\/blog\/pandas-python\/#Syntax\" title=\"Syntax\">Syntax<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-2\" href=\"https:\/\/www.copahost.com\/blog\/pandas-python\/#Main_features_of_Pandas_in_Python\" title=\"Main features of Pandas in Python\">Main features of Pandas in Python<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a 
class=\"ez-toc-link ez-toc-heading-3\" href=\"https:\/\/www.copahost.com\/blog\/pandas-python\/#Data_Management_in_Pandas\" title=\"Data Management in Pandas\">Data Management in Pandas<\/a><ul class='ez-toc-list-level-3' ><li class='ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-4\" href=\"https:\/\/www.copahost.com\/blog\/pandas-python\/#Working_with_Cells_in_Pandas_in_Python\" title=\"Working with Cells in Pandas in Python\">Working with Cells in Pandas in Python<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-5\" href=\"https:\/\/www.copahost.com\/blog\/pandas-python\/#Working_with_Columns_in_Pandas_in_Python\" title=\"Working with Columns in Pandas in Python\">Working with Columns in Pandas in Python<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-6\" href=\"https:\/\/www.copahost.com\/blog\/pandas-python\/#Working_with_Lines_in_Pandas_in_Python\" title=\"Working with Lines in Pandas in Python\">Working with Lines in Pandas in Python<\/a><\/li><\/ul><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-7\" href=\"https:\/\/www.copahost.com\/blog\/pandas-python\/#Applying_Classifiers_with_Pandas_in_python\" title=\"Applying Classifiers with Pandas in python\">Applying Classifiers with Pandas in python<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-8\" href=\"https:\/\/www.copahost.com\/blog\/pandas-python\/#Data_aggregation_in_Pandas\" title=\"Data aggregation in Pandas\">Data aggregation in Pandas<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-9\" href=\"https:\/\/www.copahost.com\/blog\/pandas-python\/#Examples_using_Pandas_in_conjunction_with_other_functions_in_python\" title=\"Examples using Pandas in conjunction with other functions in python\">Examples using Pandas in conjunction with other functions in 
python<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-10\" href=\"https:\/\/www.copahost.com\/blog\/pandas-python\/#Working_with_tables_and_charts_in_Pandas\" title=\"Working with tables and charts in Pandas\">Working with tables and charts in Pandas<\/a><\/li><\/ul><\/nav><\/div>\n\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Syntax\"><\/span>Syntax<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>The pandas syntax is&nbsp;<strong>based on DataFrames<\/strong>, which are Python objects that allow you to manage, manipulate and analyze data.&nbsp;A pandas DataFrame is similar to an Excel spreadsheet; the difference is that the data is stored in Python and can be manipulated more flexibly.<\/p>\n\n\n\n<p>To work with a DataFrame in pandas, you&nbsp;<strong>first need to load the data into a Python object<\/strong>.&nbsp;Here&#8217;s an example of how to load a CSV file into a DataFrame:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import pandas as pd\n\n# Load a CSV file into a DataFrame\ndf = pd.read_csv(\"data.csv\")<\/code><\/pre>\n\n\n\n<p>Once the DataFrame is created, you can use a number of tools and methods to manage and analyze the data.&nbsp;For example, you can use the&nbsp;<code><strong>head()<\/strong><\/code>&nbsp;method to display the first 5 rows of the DataFrame (the default), or the&nbsp;<code><strong>describe()<\/strong><\/code>&nbsp;method to display summary statistics, by default for its numeric columns.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># Display the first 5 rows of the DataFrame (pass a number, e.g. df.head(10), for more)\ndf.head()\n\n# Display summary statistics for the DataFrame\ndf.describe()<\/code><\/pre>\n\n\n\n<p>In addition to these basic methods, pandas offers a number of other methods and tools for managing and analyzing data, such as filters, aggregations and visualizations. 
Here is an example of how to apply a filter to the DataFrame:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># Keep only the rows where the \"age\" column is 30 or more\ndf&#091;df&#091;\"age\"] &gt;= 30]<\/code><\/pre>\n\n\n\n<p>This is just an example of pandas syntax.&nbsp;To learn more about the available tools and methods, continue reading this article and consult the official pandas documentation.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Main_features_of_Pandas_in_Python\"><\/span>Main features of Pandas in Python<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Data management is one of the main features of the pandas library.&nbsp;It lets you work with data in a table format, which makes the data easier to analyze and manipulate.&nbsp;Some of pandas&#8217; main functions for data management include:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Creating DataFrames:<\/strong>&nbsp;It is possible to create a DataFrame from an array or a list of lists.&nbsp;The DataFrame can contain multiple columns and rows, and each cell can contain a numeric or categorical value.<\/li>\n\n\n\n<li><strong>Reading and writing files:<\/strong>&nbsp;Pandas supports reading and writing several file formats, including CSV, Excel and JSON, as well as SQL databases.&nbsp;This makes it possible to import and export data from other data sources.<\/li>\n\n\n\n<li><strong>Filters:<\/strong>&nbsp;It is possible to filter the data in a DataFrame using a boolean expression.&nbsp;This allows you to select only the rows or columns that interest you.<\/li>\n\n\n\n<li><strong>Aggregation:<\/strong>&nbsp;Pandas offers several functions to aggregate data in a DataFrame, such as mean, standard deviation, count and more.&nbsp;These help you understand the data better and identify trends.<\/li>\n\n\n\n<li><strong>Resampling:<\/strong>&nbsp;Pandas supports resampling time series to a different frequency with&nbsp;<code>resample()<\/code>, and&nbsp;<code>sample()<\/code>&nbsp;lets you draw random subsets of rows, for example to split data into samples for 
training and testing machine learning models.<\/li>\n\n\n\n<li><strong>Data manipulation:<\/strong>&nbsp;Pandas provides several functions to manipulate data in a DataFrame, such as removing rows or columns, renaming columns, sorting and more.<\/li>\n<\/ul>\n\n\n\n<p>These are just some of pandas&#8217; features for data management.&nbsp;The library is quite powerful and offers many other features to work with data efficiently and intuitively.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Data_Management_in_Pandas\"><\/span>Data Management in Pandas<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Pandas lets you work with data in a table format.&nbsp;It is popular due to its ease of use and its ability to handle various types of data, including numerical and categorical data.&nbsp;To work with pandas, it is important to understand how to work with the cells, rows and columns of a DataFrame.<\/p>\n\n\n\n<p>For example, in a DataFrame with 3 rows and 4 columns, positions are counted from zero along each axis: the cell in the upper left corner is at row 0, column 0, the cell to its right is at row 0, column 1, and so on.&nbsp;The cell in the third row and fourth column is therefore at row 2, column 3.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Working_with_Cells_in_Pandas_in_Python\"><\/span>Working with Cells in Pandas in Python<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>Cells are the individual elements of a DataFrame.&nbsp;Each cell contains a single value and can be accessed by its position in the table.&nbsp;That position is given by a pair of zero-based integers: the row position and the column position.<\/p>\n\n\n\n<p>A single cell cannot be removed on its own, but you can delete the whole row that contains it: use the&nbsp;<code><strong>drop()<\/strong><\/code>&nbsp;method and specify the index label of the row you want to 
delete.&nbsp;For example:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import pandas as pd\n\n# create a new DataFrame with 3 rows and 3 columns\ndf = pd.DataFrame({'A': &#091;1, 2, 3], 'B': &#091;4, 5, 6], 'C': &#091;7, 8, 9]})\n\n# delete the row with index label 0 (the first row)\ndf = df.drop(0, axis=0)\n\n# print the resulting DataFrame\nprint(df)<\/code><\/pre>\n\n\n\n<p>Output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>   A  B  C\n1  2  5  8\n2  3  6  9<\/code><\/pre>\n\n\n\n<p>To change the value of a cell, access the cell by its position and assign a new value to it.&nbsp;For example:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># change the value in the second remaining row (position 1), first column (position 0) to 10\ndf.iloc&#091;1, 0] = 10\n\n# print the resulting DataFrame\nprint(df)<\/code><\/pre>\n\n\n\n<p>Output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>    A  B  C\n1   2  5  8\n2  10  6  9<\/code><\/pre>\n\n\n\n<p>To get the value of a cell, access the cell by its position.&nbsp;For example:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># get the value at row position 1, column position 0\nvalue = df.iloc&#091;1, 0]\n\n# print the value\nprint(value)<\/code><\/pre>\n\n\n\n<p>Output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>10<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Working_with_Columns_in_Pandas_in_Python\"><\/span>Working with Columns in Pandas in Python<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>Columns are one of the basic elements of a table in pandas.&nbsp;Each column represents a variable or attribute and holds the values of that variable for every row.&nbsp;Columns belong to a DataFrame, which is a table of data consisting of rows and columns.<\/p>\n\n\n\n<p>Pandas offers several ways to work with columns.&nbsp;Some of the most common operations include:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Get the column names: you can use the&nbsp;<strong>columns<\/strong>&nbsp;attribute to list every column 
name.<br>Example:<\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>print(df.columns)<\/code><\/pre>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Get the number of columns: you can use the&nbsp;<strong>shape<\/strong>&nbsp;attribute and access its second element to get the number of columns.<br>Example:<\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>print(df.shape&#091;1])<\/code><\/pre>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Get the first value of a column: you can use the&nbsp;<strong>iloc<\/strong>&nbsp;indexer with square brackets and provide the row position and then the column position.<br>Example:<\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>print(df.iloc&#091;0, 0])<\/code><\/pre>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Get all values from a column: you can use square brackets with the column name.<br>Example:<\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>print(df&#091;\"col1\"])<\/code><\/pre>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Get a list of all column names: you can use the&nbsp;<strong>columns<\/strong>&nbsp;attribute together with&nbsp;<strong>tolist()<\/strong>.<br>Example:<\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>print(df.columns.tolist())<\/code><\/pre>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Get the unique values of a column: you can use the&nbsp;<strong>unique()<\/strong>&nbsp;method.<br>Example:<\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>print(df&#091;\"col1\"].unique())<\/code><\/pre>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Get the number of rows and columns: you can use the&nbsp;<strong>shape<\/strong>&nbsp;attribute, which returns a (rows, columns) tuple.<br>Example:<\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>print(df.shape)<\/code><\/pre>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Use a column as the index: you can use the&nbsp;<strong>set_index()<\/strong>&nbsp;method.<br>Example:<\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>df.set_index(\"col1\", inplace=True)<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" 
id=\"Working_with_Lines_in_Pandas_in_Python\"><\/span>Working with Lines in Pandas in Python<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>Rows are the other basic element of a table in pandas.&nbsp;Each row represents an individual or observation and contains one value for each variable.&nbsp;Like columns, rows belong to a DataFrame, which is a table of data consisting of rows and columns.<\/p>\n\n\n\n<p>Pandas offers several ways to work with rows.&nbsp;Some of the most common operations include:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Get the number of rows: you can use the&nbsp;<strong>shape<\/strong>&nbsp;attribute and access its first element to get the number of rows.<br>Example:<\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>print(df.shape&#091;0])<\/code><\/pre>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Get a single value from a row: you can use the&nbsp;<strong>iloc<\/strong>&nbsp;indexer with square brackets and provide the row position and then the column position.<br>Example:<\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>print(df.iloc&#091;0, 0])<\/code><\/pre>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Get the values of every row in one column: you can use square brackets with the column name.<br>Example:<\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>print(df&#091;\"linha1\"])<\/code><\/pre>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Get a list of all row index labels: you can use the&nbsp;<strong>index<\/strong>&nbsp;attribute together with&nbsp;<strong>tolist()<\/strong>.<br>Example:<\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>print(df.index.tolist())<\/code><\/pre>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Get the unique values that appear across the rows of a column: you can use the&nbsp;<strong>unique()<\/strong>&nbsp;method.<br>Example:<\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>print(df&#091;\"linha1\"].unique())<\/code><\/pre>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Use a column as the row index: you 
can use the&nbsp;<strong>set_index()<\/strong>&nbsp;method.<br>Example:<\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>df.set_index(\"linha1\", inplace=True)<\/code><\/pre>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Get the first x rows: you can use the&nbsp;<strong>head()<\/strong>&nbsp;method.<br>Example:<\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>print(df.head(x))<\/code><\/pre>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Get the last x rows: you can use the&nbsp;<strong>tail()<\/strong>&nbsp;method.<br>Example:<\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>print(df.tail(x))<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Applying_Classifiers_with_Pandas_in_python\"><\/span>Applying Classifiers with Pandas in python<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Classification here means grouping or summarizing the data in a DataFrame based on specific criteria.&nbsp;Pandas offers several tools for this. The examples below assume a DataFrame with the columns&nbsp;<code>idade<\/code>&nbsp;(age),&nbsp;<code>salario<\/code>&nbsp;(salary) and&nbsp;<code>g\u00eanero<\/code>&nbsp;(gender):<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>groupby(): The&nbsp;<strong>groupby()<\/strong>&nbsp;method allows you to group the rows of a DataFrame based on one or more columns.<br>Example:<\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>group = df.groupby(\"idade\")<\/code><\/pre>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>agg(): The agg()<\/strong>&nbsp;method&nbsp;allows you to calculate aggregated values for each group created by the groupby() method.<br>Example:<\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>group.agg({\"salario\": \"mean\"})<\/code><\/pre>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>pivot_table(): The pivot_table()<\/strong>&nbsp;function&nbsp;allows you to create pivot tables, which summarize one column (using the mean by default) for each combination of values in other columns.<br>Example:<\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>pd.pivot_table(df, values=\"salario\", index=\"idade\", columns=\"g\u00eanero\")<\/code><\/pre>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>crosstab(): 
The crosstab()<\/strong>&nbsp;function&nbsp;allows you to create cross-tabulations, which by default count how often each combination of values occurs.<br>Example:<\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>pd.crosstab(df&#091;\"idade\"], df&#091;\"g\u00eanero\"])<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Data_aggregation_in_Pandas\"><\/span>Data aggregation in Pandas<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Data aggregation is an important process in data analysis, which consists of summarizing many values, for example all the values in a column, into a single value.&nbsp;In Python we can use the Pandas library to perform data aggregation operations.<\/p>\n\n\n\n<p>The Pandas library offers several functions for aggregating data.&nbsp;Some of these functions are:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><code><strong>sum()<\/strong><\/code>: Sums the values of all cells in a column.<\/li>\n\n\n\n<li><strong><code>mean()<\/code>:<\/strong>&nbsp;Calculates the average of the values of all cells in a column.<\/li>\n\n\n\n<li><code><strong>median()<\/strong><\/code>: Returns the middle value of a column once its values are sorted; when the number of values is even, it is the average of the two middle values.<\/li>\n\n\n\n<li><code><strong>min()<\/strong><\/code>: Returns the smallest value in a column.<\/li>\n\n\n\n<li><code><strong>max()<\/strong><\/code>: Returns the largest value in a column.<\/li>\n\n\n\n<li><code><strong>count()<\/strong><\/code>: Returns the number of non-missing values in a column.<\/li>\n<\/ul>\n\n\n\n<p>In addition to these functions, the Pandas library also offers&nbsp;<code><strong>groupby()<\/strong><\/code>&nbsp;for aggregating by group, and&nbsp;<code><strong>merge()<\/strong><\/code>&nbsp;and&nbsp;<code><strong>join()<\/strong><\/code>&nbsp;for combining DataFrames before aggregating, among others.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" 
id=\"Examples_using_Pandas_in_conjunction_with_other_functions_in_python\"><\/span>Examples using Pandas in conjunction with other functions in python<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Let&#8217;s give some examples of how to use other functions with the Pandas package in Python:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Using&nbsp;&nbsp;<code><a href=\"https:\/\/www.copahost.com\/blog\/append-python\/\">append<\/a>()<\/code>:<\/li>\n<\/ol>\n\n\n\n<p>The&nbsp;<strong><code>append()<\/code><\/strong>&nbsp;method was used to add rows to an existing DataFrame.&nbsp;It was removed in pandas 2.0, so current code uses&nbsp;<code>pd.concat()<\/code>&nbsp;for the same purpose.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import pandas as pd\n\n# Creating a DataFrame\ndf = pd.DataFrame({'A': &#091;1, 2, 3], 'B': &#091;'a', 'b', 'c']})\n\n# Adding a row of data to the DataFrame\n# (DataFrame.append() was removed in pandas 2.0; pd.concat() replaces it)\nnew_row = pd.DataFrame(&#091;{'A': 4, 'B': 'd'}])\ndf = pd.concat(&#091;df, new_row], ignore_index=True)\n\nprint(df)<\/code><\/pre>\n\n\n\n<p>Output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>   A  B\n0  1  a\n1  2  b\n2  3  c\n3  4  d<\/code><\/pre>\n\n\n\n<ol class=\"wp-block-list\" start=\"2\">\n<li>Using null:<\/li>\n<\/ol>\n\n\n\n<p>The&nbsp;value&nbsp;<code>NaN<\/code>&nbsp;(a&nbsp;<a href=\"https:\/\/www.copahost.com\/blog\/null-python\/\">null value<\/a>)&nbsp;is used to indicate that data is not available.&nbsp;Because&nbsp;<code>NaN<\/code>&nbsp;is a floating-point value, adding it to an integer column converts that column to floats.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import numpy as np\nimport pandas as pd\n\n# Creating a DataFrame\ndf = pd.DataFrame({'A': &#091;1, 2, 3], 'B': &#091;'a', 'b', 'c']})\n\n# Adding a row whose values are all missing (NaN)\ndf.loc&#091;3] = np.nan\n\nprint(df)<\/code><\/pre>\n\n\n\n<p>Output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>     A    B\n0  1.0    a\n1  2.0    b\n2  3.0    c\n3  NaN  NaN<\/code><\/pre>\n\n\n\n<ol class=\"wp-block-list\" start=\"3\">\n<li>Using two functions together:<\/li>\n<\/ol>\n\n\n\n<p>The&nbsp;<strong><code>apply()<\/code><\/strong>&nbsp;method is used to apply a function to each value in a column.&nbsp;An ordinary Python&nbsp;<code>if<\/code>\/<code>else<\/code>&nbsp;statement inside that function is 
used to define a value to be returned based on a condition.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import pandas as pd\n\n# Creating a DataFrame\ndf = pd.DataFrame({'A': &#091;1, 2, 3], 'B': &#091;4, 5, 6]})\n\n# Defining a function that returns 1 for even numbers and 0 for odd numbers\ndef is_even_or_odd(x):\n    if x % 2 == 0:\n        return 1\n    else:\n        return 0\n\n# Applying the function to column A and storing the result in a new column C\ndf&#091;'C'] = df&#091;'A'].apply(is_even_or_odd)\n\nprint(df)<\/code><\/pre>\n\n\n\n<p>Output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>   A  B  C\n0  1  4  0\n1  2  5  1\n2  3  6  0<\/code><\/pre>\n\n\n\n<ol class=\"wp-block-list\" start=\"4\">\n<li>Using&nbsp;<strong>&nbsp;<code>range()<\/code><\/strong>:<\/li>\n<\/ol>\n\n\n\n<p>The built-in function&nbsp;<code><a href=\"https:\/\/www.copahost.com\/blog\/python-range\/\">range()<\/a><\/code>&nbsp;is used to generate a sequence of numbers.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import pandas as pd\n\n# Creating a DataFrame\ndf = pd.DataFrame({'A': &#091;1, 2, 3]})\n\n# Generating a sequence of numbers with the built-in range() function\nseq = range(len(df))\n\n# Creating a new column with the sequence values\ndf&#091;'B'] = seq\n\nprint(df)<\/code><\/pre>\n\n\n\n<p>Output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>   A  B\n0  1  0\n1  2  1\n2  3  2<\/code><\/pre>\n\n\n\n<p>In this example the sequence of numbers is generated with the built-in&nbsp;<code><strong>range()<\/strong><\/code>&nbsp;function, based on the number of rows (<code><a href=\"https:\/\/www.copahost.com\/blog\/len-python\/\">len<\/a>(df)<\/code>) in the DataFrame.&nbsp;Then we add the sequence as a new column to the DataFrame.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Working_with_tables_and_charts_in_Pandas\"><\/span>Working with tables and charts in Pandas<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Pandas is a powerful 
Python library for analyzing and manipulating data.&nbsp;It also provides several tools for creating tables and graphs to visualize your data.<\/p>\n\n\n\n<p>To create a table with Pandas, load a CSV file with&nbsp;<code><strong>read_csv()<\/strong><\/code>, or pass a Python list to the&nbsp;<code><strong>pd.DataFrame()<\/strong><\/code>&nbsp;constructor to convert the list into a DataFrame.&nbsp;The DataFrame is a data structure similar to an Excel spreadsheet, which we can manipulate with several operations.<\/p>\n\n\n\n<p>Here&#8217;s an example of how to load a CSV file and create a table:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import pandas as pd\n\n# Loading the CSV file\ndf = pd.read_csv('file.csv')\n\n# Display the table\nprint(df)<\/code><\/pre>\n\n\n\n<p>Pandas also provides several options for cleaning and formatting DataFrame cells, such as&nbsp;<strong><code>fillna()<\/code><\/strong>&nbsp;for filling in missing values,&nbsp;<strong><code>round()<\/code><\/strong>&nbsp;for rounding values, and&nbsp;<code><strong>str.upper()<\/strong><\/code>&nbsp;for converting a text column to uppercase, among others.<\/p>\n\n\n\n<p>To create graphs, Pandas offers the&nbsp;<code><strong>plot()<\/strong><\/code>&nbsp;method, which is built on Matplotlib and can generate different types of graphs, such as line charts, bar charts and scatter plots, among others.&nbsp;Here&#8217;s an example of how to create a bar chart:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import pandas as pd\nimport matplotlib.pyplot as plt\n\n# Loading the CSV file\ndf = pd.read_csv('file.csv')\n\n# Creating a bar chart\ndf.plot(kind='bar')\n\n# displaying the graph\nplt.show()<\/code><\/pre>\n\n\n\n<p>Pandas also allows you to customize the charts, such as changing the title, the axis labels, the width of the lines, among other options.<\/p>\n\n\n\n<p>In addition, the charts can be saved through Matplotlib as images in several formats, such as PNG, JPG, PDF, among others.&nbsp;Here is an example of how to save a chart as a PNG 
image:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import pandas as pd\nimport matplotlib.pyplot as plt\n\n# Loading the CSV file\ndf = pd.read_csv('file.csv')\n\n# Creating a bar chart\ndf.plot(kind='bar')\n\n# Saving the chart as a PNG image\nplt.savefig('bar_chart.png')\n\n# Closing the figure\nplt.close()<\/code><\/pre>\n\n\n\n<p>Visualizations are fundamental when presenting your results.&nbsp;They help to convey the results in a clear and concise way, making the data easier to understand.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Pandas is a&nbsp;Python library that lets you work with data sifted into Excel or CSV spreadsheets&nbsp;.&nbsp;It is one of the most popular and widely used Python libraries, and is especially useful for anyone working in data analysis.&nbsp;Thus, with pandas, it is possible&nbsp;to manage, manipulate and analyze data in an easy and fast way&nbsp;, making the [&hellip;]<\/p>\n","protected":false},"author":17,"featured_media":3692,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[174],"tags":[],"class_list":["post-3683","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-python"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v23.8 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Pandas python: How to analyze data with pandas in Python - Copahost<\/title>\n<meta name=\"description\" content=\"pandas is the perfect solution for working with data in Python! 
With it, you can manage, manipulate and analyze data quickly.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.copahost.com\/blog\/pandas-python\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Pandas python: How to analyze data with pandas in Python - Copahost\" \/>\n<meta property=\"og:description\" content=\"pandas is the perfect solution for working with data in Python! With it, you can manage, manipulate and analyze data quickly.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.copahost.com\/blog\/pandas-python\/\" \/>\n<meta property=\"og:site_name\" content=\"Copahost\" \/>\n<meta property=\"article:published_time\" content=\"2023-09-10T11:20:34+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2023-09-10T11:20:39+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.copahost.com\/blog\/wp-content\/uploads\/2023\/09\/pandas-artigo.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1075\" \/>\n\t<meta property=\"og:image:height\" content=\"969\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Schenia T\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Schenia T\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"8 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.copahost.com\/blog\/pandas-python\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.copahost.com\/blog\/pandas-python\/\"},\"author\":{\"name\":\"Schenia T\",\"@id\":\"https:\/\/www.copahost.com\/blog\/#\/schema\/person\/2efb96f9dfaf6162f347abcd06b1429f\"},\"headline\":\"Pandas python: How to analyze data with pandas in Python\",\"datePublished\":\"2023-09-10T11:20:34+00:00\",\"dateModified\":\"2023-09-10T11:20:39+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.copahost.com\/blog\/pandas-python\/\"},\"wordCount\":1972,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/www.copahost.com\/blog\/#organization\"},\"image\":{\"@id\":\"https:\/\/www.copahost.com\/blog\/pandas-python\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.copahost.com\/blog\/wp-content\/uploads\/2023\/09\/pandas-artigo.png\",\"articleSection\":[\"Python\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/www.copahost.com\/blog\/pandas-python\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.copahost.com\/blog\/pandas-python\/\",\"url\":\"https:\/\/www.copahost.com\/blog\/pandas-python\/\",\"name\":\"Pandas python: How to analyze data with pandas in Python - Copahost\",\"isPartOf\":{\"@id\":\"https:\/\/www.copahost.com\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.copahost.com\/blog\/pandas-python\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.copahost.com\/blog\/pandas-python\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.copahost.com\/blog\/wp-content\/uploads\/2023\/09\/pandas-artigo.png\",\"datePublished\":\"2023-09-10T11:20:34+00:00\",\"dateModified\":\"2023-09-10T11:20:39+00:00\",\"description\":\"pandas is the perfect solution 
for working with data in Python! With it, you can manage, manipulate and analyze data quickly.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.copahost.com\/blog\/pandas-python\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.copahost.com\/blog\/pandas-python\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.copahost.com\/blog\/pandas-python\/#primaryimage\",\"url\":\"https:\/\/www.copahost.com\/blog\/wp-content\/uploads\/2023\/09\/pandas-artigo.png\",\"contentUrl\":\"https:\/\/www.copahost.com\/blog\/wp-content\/uploads\/2023\/09\/pandas-artigo.png\",\"width\":1075,\"height\":969,\"caption\":\"pandas python\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.copahost.com\/blog\/pandas-python\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.copahost.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Pandas python: How to analyze data with pandas in 
Python\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.copahost.com\/blog\/#website\",\"url\":\"https:\/\/www.copahost.com\/blog\/\",\"name\":\"Copahost\",\"description\":\"\",\"publisher\":{\"@id\":\"https:\/\/www.copahost.com\/blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.copahost.com\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.copahost.com\/blog\/#organization\",\"name\":\"Copahost\",\"url\":\"https:\/\/www.copahost.com\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.copahost.com\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.copahost.com\/blog\/wp-content\/uploads\/2016\/03\/copahostlogo.png\",\"contentUrl\":\"https:\/\/www.copahost.com\/blog\/wp-content\/uploads\/2016\/03\/copahostlogo.png\",\"width\":223,\"height\":40,\"caption\":\"Copahost\"},\"image\":{\"@id\":\"https:\/\/www.copahost.com\/blog\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.copahost.com\/blog\/#\/schema\/person\/2efb96f9dfaf6162f347abcd06b1429f\",\"name\":\"Schenia T\",\"description\":\"Data scientist, passionate about technology tools and games. Undergraduate student in Statistics at UFPB. Her hobby is binge-watching series, enjoying good music working or cooking, going to the movies and learning new things!\",\"url\":\"https:\/\/www.copahost.com\/blog\/author\/schenia\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Pandas python: How to analyze data with pandas in Python - Copahost","description":"pandas is the perfect solution for working with data in Python! 
With it, you can manage, manipulate and analyze data quickly.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.copahost.com\/blog\/pandas-python\/","og_locale":"en_US","og_type":"article","og_title":"Pandas python: How to analyze data with pandas in Python - Copahost","og_description":"pandas is the perfect solution for working with data in Python! With it, you can manage, manipulate and analyze data quickly.","og_url":"https:\/\/www.copahost.com\/blog\/pandas-python\/","og_site_name":"Copahost","article_published_time":"2023-09-10T11:20:34+00:00","article_modified_time":"2023-09-10T11:20:39+00:00","og_image":[{"width":1075,"height":969,"url":"https:\/\/www.copahost.com\/blog\/wp-content\/uploads\/2023\/09\/pandas-artigo.png","type":"image\/png"}],"author":"Schenia T","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Schenia T","Est. 
reading time":"8 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.copahost.com\/blog\/pandas-python\/#article","isPartOf":{"@id":"https:\/\/www.copahost.com\/blog\/pandas-python\/"},"author":{"name":"Schenia T","@id":"https:\/\/www.copahost.com\/blog\/#\/schema\/person\/2efb96f9dfaf6162f347abcd06b1429f"},"headline":"Pandas python: How to analyze data with pandas in Python","datePublished":"2023-09-10T11:20:34+00:00","dateModified":"2023-09-10T11:20:39+00:00","mainEntityOfPage":{"@id":"https:\/\/www.copahost.com\/blog\/pandas-python\/"},"wordCount":1972,"commentCount":0,"publisher":{"@id":"https:\/\/www.copahost.com\/blog\/#organization"},"image":{"@id":"https:\/\/www.copahost.com\/blog\/pandas-python\/#primaryimage"},"thumbnailUrl":"https:\/\/www.copahost.com\/blog\/wp-content\/uploads\/2023\/09\/pandas-artigo.png","articleSection":["Python"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.copahost.com\/blog\/pandas-python\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.copahost.com\/blog\/pandas-python\/","url":"https:\/\/www.copahost.com\/blog\/pandas-python\/","name":"Pandas python: How to analyze data with pandas in Python - Copahost","isPartOf":{"@id":"https:\/\/www.copahost.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.copahost.com\/blog\/pandas-python\/#primaryimage"},"image":{"@id":"https:\/\/www.copahost.com\/blog\/pandas-python\/#primaryimage"},"thumbnailUrl":"https:\/\/www.copahost.com\/blog\/wp-content\/uploads\/2023\/09\/pandas-artigo.png","datePublished":"2023-09-10T11:20:34+00:00","dateModified":"2023-09-10T11:20:39+00:00","description":"pandas is the perfect solution for working with data in Python! 
With it, you can manage, manipulate and analyze data quickly.","breadcrumb":{"@id":"https:\/\/www.copahost.com\/blog\/pandas-python\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.copahost.com\/blog\/pandas-python\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.copahost.com\/blog\/pandas-python\/#primaryimage","url":"https:\/\/www.copahost.com\/blog\/wp-content\/uploads\/2023\/09\/pandas-artigo.png","contentUrl":"https:\/\/www.copahost.com\/blog\/wp-content\/uploads\/2023\/09\/pandas-artigo.png","width":1075,"height":969,"caption":"pandas python"},{"@type":"BreadcrumbList","@id":"https:\/\/www.copahost.com\/blog\/pandas-python\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.copahost.com\/blog\/"},{"@type":"ListItem","position":2,"name":"Pandas python: How to analyze data with pandas in Python"}]},{"@type":"WebSite","@id":"https:\/\/www.copahost.com\/blog\/#website","url":"https:\/\/www.copahost.com\/blog\/","name":"Copahost","description":"","publisher":{"@id":"https:\/\/www.copahost.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.copahost.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.copahost.com\/blog\/#organization","name":"Copahost","url":"https:\/\/www.copahost.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.copahost.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.copahost.com\/blog\/wp-content\/uploads\/2016\/03\/copahostlogo.png","contentUrl":"https:\/\/www.copahost.com\/blog\/wp-content\/uploads\/2016\/03\/copahostlogo.png","width":223,"height":40,"caption":"Copahost"},"image":{"@id":"https:\/\/www.copahost.com\/blog\/#\/schema\/logo\/ima
ge\/"}},{"@type":"Person","@id":"https:\/\/www.copahost.com\/blog\/#\/schema\/person\/2efb96f9dfaf6162f347abcd06b1429f","name":"Schenia T","description":"Data scientist, passionate about technology tools and games. Undergraduate student in Statistics at UFPB. Her hobby is binge-watching series, enjoying good music working or cooking, going to the movies and learning new things!","url":"https:\/\/www.copahost.com\/blog\/author\/schenia\/"}]}},"_links":{"self":[{"href":"https:\/\/www.copahost.com\/blog\/wp-json\/wp\/v2\/posts\/3683","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.copahost.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.copahost.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.copahost.com\/blog\/wp-json\/wp\/v2\/users\/17"}],"replies":[{"embeddable":true,"href":"https:\/\/www.copahost.com\/blog\/wp-json\/wp\/v2\/comments?post=3683"}],"version-history":[{"count":4,"href":"https:\/\/www.copahost.com\/blog\/wp-json\/wp\/v2\/posts\/3683\/revisions"}],"predecessor-version":[{"id":3696,"href":"https:\/\/www.copahost.com\/blog\/wp-json\/wp\/v2\/posts\/3683\/revisions\/3696"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.copahost.com\/blog\/wp-json\/wp\/v2\/media\/3692"}],"wp:attachment":[{"href":"https:\/\/www.copahost.com\/blog\/wp-json\/wp\/v2\/media?parent=3683"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.copahost.com\/blog\/wp-json\/wp\/v2\/categories?post=3683"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.copahost.com\/blog\/wp-json\/wp\/v2\/tags?post=3683"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}