Pandas How to Not Reading First Column as Index in Excel
This article is part of the Transition from Excel to Python series. We have walked through the data I/O (reading and saving files) part. Let's move on to something more interesting. In Excel, we can see the rows, columns, and cells, and we can reference values with a "=" sign or inside a formula. In Python, the data is stored in computer memory (i.e., not directly visible to the user). Luckily, the pandas library provides easy ways to get values, rows, and columns.
Let's first set up a dataframe, so we have something to work with. We'll use this example file from earlier, and we can open the Excel file on the side for reference.
import pandas as pd
df = pd.read_excel('users.xlsx')
>>> df
      User Name Country      City Gender  Age
0  Forrest Gump     USA  New York      M   50
1     Mary Jane  CANADA   Tornoto      F   30
2  Harry Porter      UK    London      M   20
3     Jean Grey   CHINA  Shanghai      F   30
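If users.xlsx is not at hand, the same table can be rebuilt in memory. This is only a minimal sketch: the values are copied from the printout above, and the real file may contain more than is shown there.

```python
import pandas as pd

# Assumption: these values mirror the users.xlsx printout; the actual file may differ.
df = pd.DataFrame({
    "User Name": ["Forrest Gump", "Mary Jane", "Harry Porter", "Jean Grey"],
    "Country": ["USA", "CANADA", "UK", "CHINA"],
    "City": ["New York", "Tornoto", "London", "Shanghai"],
    "Gender": ["M", "F", "M", "F"],
    "Age": [50, 30, 20, 30],
})

print(df)
```

A dataframe built this way gets the same default integer index (0 to 3) as the one returned by pd.read_excel, so the rest of the examples behave the same.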
Some observations about this small table/dataframe:
- There are five columns with names: "User Name", "Country", "City", "Gender", "Age"
- There are 4 rows (excluding the header row)
df.index returns the row index; in our case, it's simply the integers 0, 1, 2, 3.
df.columns gives the list of the column (header) names.
df.shape shows the dimensions of the dataframe; in this case it's 4 rows by 5 columns.
>>> df.index
RangeIndex(start=0, stop=4, step=1)
>>> df.columns
Index(['User Name', 'Country', 'City', 'Gender', 'Age'], dtype='object')
>>> df.shape
(4, 5)
pandas get columns
There are several ways to get columns in pandas. Each method has its pros and cons, so I would use them differently based on the situation.
The dot notation
We can type df.Country to get the "Country" column. This is a quick and easy way to get columns. However, if the column name contains a space, such as "User Name", this method will not work.
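The limits of the dot notation can be sketched as follows. The "count" column is a hypothetical addition, not part of users.xlsx, included only to show a second pitfall: a column whose name matches an existing DataFrame method.

```python
import pandas as pd

df = pd.DataFrame({
    "User Name": ["Forrest Gump", "Mary Jane"],
    "Country": ["USA", "CANADA"],
    "count": [1, 2],  # hypothetical column, added only to show a name clash
})

# Works: "Country" is a valid Python identifier and not an existing attribute.
country = df.Country

# "df.User Name" is not valid Python, so it fails before pandas is even involved.
try:
    compile("df.User Name", "<example>", "eval")
    dot_with_space_parses = True
except SyntaxError:
    dot_with_space_parses = False

# "count" is already a DataFrame method, so the dot returns the method,
# not the column. Square brackets still return the column.
dot_count_is_method = callable(df.count)
count_column = df["count"]
```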
>>> df.Country
0       USA
1    CANADA
2        UK
3     CHINA
Name: Country, dtype: object
>>> df.Age
0    50
1    30
2    20
3    30
Name: Age, dtype: int64
>>> df.User Name
SyntaxError: invalid syntax
Square brackets notation
This is my personal favorite. It requires a dataframe name and a column name, and goes like this: dataframe[column name]. The column name inside the square brackets is a string, so we have to put quotation marks around it. Although it requires more typing than the dot notation, this method works for any column name. Because we wrap the column name in quotes, names with spaces are also allowed here.
>>> df['User Name']
0    Forrest Gump
1       Mary Jane
2    Harry Porter
3       Jean Grey
Name: User Name, dtype: object
>>> df['City']
0    New York
1     Tornoto
2      London
3    Shanghai
Name: City, dtype: object
Get multiple columns
The square bracket notation makes getting multiple columns easy. The syntax is similar, but instead, we pass a list of strings into the square brackets. Pay attention to the double square brackets:
dataframe[ [column name 1, column name 2, column name 3, ... ] ]
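The difference between single and double square brackets shows up in the type of the result. A short sketch, using a cut-down copy of the example table as an assumption:

```python
import pandas as pd

df = pd.DataFrame({
    "User Name": ["Forrest Gump", "Mary Jane", "Harry Porter", "Jean Grey"],
    "Gender": ["M", "F", "M", "F"],
    "Age": [50, 30, 20, 30],
})

one_column = df["Age"]           # a string inside the brackets gives a Series
one_column_frame = df[["Age"]]   # a one-item list gives a one-column DataFrame
several = df[["User Name", "Age", "Gender"]]  # columns come back in the order listed

print(type(one_column).__name__)
print(type(one_column_frame).__name__)
print(list(several.columns))
```

The outer pair of brackets is the indexing operation, and the inner pair is an ordinary Python list of column names.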
>>> df[['User Name', 'Age', 'Gender']]
      User Name  Age Gender
0  Forrest Gump   50      M
1     Mary Jane   30      F
2  Harry Porter   20      M
3     Jean Grey   30      F
pandas get rows
We can use .loc[] to get rows. Note the square brackets here instead of parentheses (). The syntax is like this: df.loc[row, column]. column is optional, and if left blank, we get the entire row. .loc looks rows up by their index label, and because our default index starts at 0, df.loc[0] returns the first row of the dataframe.
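Here is a small sketch of how .loc treats the row argument as an index label. The set_index step is an assumption added for illustration; the original example keeps the default integer index.

```python
import pandas as pd

df = pd.DataFrame({
    "User Name": ["Forrest Gump", "Mary Jane", "Harry Porter", "Jean Grey"],
    "Country": ["USA", "CANADA", "UK", "CHINA"],
    "City": ["New York", "Tornoto", "London", "Shanghai"],
})

# With the default index, the labels are 0, 1, 2, 3, so label 0 is the first row.
first_row = df.loc[0]

# With names as the index, the same lookup uses a name as the row label.
by_name = df.set_index("User Name")
mary_city = by_name.loc["Mary Jane", "City"]

print(first_row["User Name"])
print(mary_city)
```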
Get 1 row
>>> df.loc[0]
User Name    Forrest Gump
Country               USA
City             New York
Gender                  M
Age                    50
Name: 0, dtype: object
>>> df.loc[2]
User Name    Harry Porter
Country                UK
City               London
Gender                  M
Age                    20
Name: 2, dtype: object
Get multiple rows
We'll have to use indexing/slicing to get multiple rows. In pandas, this looks similar to how we index/slice a Python list, with one difference: a .loc slice includes the end label.
To get the first three rows, we can do the following:
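The difference between a list slice and a .loc slice can be seen side by side. A minimal sketch, assuming the default integer index:

```python
import pandas as pd

names = ["Forrest Gump", "Mary Jane", "Harry Porter", "Jean Grey"]
df = pd.DataFrame({"User Name": names, "Age": [50, 30, 20, 30]})

list_slice = names[0:2]   # a list slice stops before position 2
loc_slice = df.loc[0:2]   # a .loc slice includes the row labelled 2

print(len(list_slice))
print(len(loc_slice))
```

So 0:2 gives two items from the list but three rows from the dataframe.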
>>> df.loc[0:2]
      User Name Country      City Gender  Age
0  Forrest Gump     USA  New York      M   50
1     Mary Jane  CANADA   Tornoto      F   30
2  Harry Porter      UK    London      M   20
pandas get cell values
To get individual cell values, we need to use the intersection of rows and columns. Think about how we reference cells within Excel, like a cell "C10", or a range "C10:E20". The following two approaches both follow this row & column idea.
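To make the Excel analogy concrete, here is a sketch of a hypothetical helper, excel_cell, that translates a reference such as "C3" into a row label and a column name. It is not part of pandas, and it assumes the header sits in Excel row 1 and the data starts in column A.

```python
import re
import pandas as pd

df = pd.DataFrame({
    "User Name": ["Forrest Gump", "Mary Jane", "Harry Porter", "Jean Grey"],
    "Country": ["USA", "CANADA", "UK", "CHINA"],
    "City": ["New York", "Tornoto", "London", "Shanghai"],
    "Gender": ["M", "F", "M", "F"],
    "Age": [50, 30, 20, 30],
})

def excel_cell(frame, ref):
    """Hypothetical helper: value at an Excel-style reference like "C3".

    Assumes the header is in Excel row 1 and the data starts in column A.
    """
    match = re.fullmatch(r"([A-Za-z]+)([0-9]+)", ref)
    if match is None:
        raise ValueError(f"not a cell reference: {ref!r}")
    letters = match.group(1).upper()
    excel_row = int(match.group(2))
    if excel_row < 2:
        raise ValueError("row 1 is the header row in this sketch")
    # Convert column letters to a 1-based number: A=1, B=2, ..., Z=26, AA=27.
    col_number = 0
    for ch in letters:
        col_number = col_number * 26 + (ord(ch) - ord("A") + 1)
    # Translate positions into labels, then look the cell up with .loc.
    row_label = frame.index[excel_row - 2]
    col_label = frame.columns[col_number - 1]
    return frame.loc[row_label, col_label]

city = excel_cell(df, "C3")  # column C is "City", Excel row 3 is the second data row
age = excel_cell(df, "E2")   # column E is "Age", Excel row 2 is the first data row
```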
Square brackets notation
Using the square brackets notation, the syntax is like this: dataframe[column name][row index]. This is sometimes called chained indexing. An easier way to remember this notation is: dataframe[column name] gives a column, then adding another [row index] gives the specific item from that column.
Let's say we want to get the City for Mary Jane (on row 2).
>>> df['City'][1]
'Tornoto'
To get the 2nd and the 4th row, and only the User Name, Gender and Age columns, we can pass the rows and columns as two lists, as shown below.
>>> df[['User Name', 'Age', 'Gender']].loc[[1,3]]
   User Name  Age Gender
1  Mary Jane   30      F
3  Jean Grey   30      F
Remember, df[['User Name', 'Age', 'Gender']] returns a new dataframe with only three columns. Then .loc[[1,3]] returns the 2nd and 4th rows of that dataframe.
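The chained form can be checked against a single .loc[row, column] call. A minimal sketch, assuming the default integer index, in which both routes return the same values:

```python
import pandas as pd

df = pd.DataFrame({
    "User Name": ["Forrest Gump", "Mary Jane", "Harry Porter", "Jean Grey"],
    "Country": ["USA", "CANADA", "UK", "CHINA"],
    "Gender": ["M", "F", "M", "F"],
    "Age": [50, 30, 20, 30],
})

# One cell: column first and then row, versus row and column in one call.
chained = df["Country"][2]
with_loc = df.loc[2, "Country"]

# Several rows and columns: two chained steps versus one .loc call.
subset_chained = df[["User Name", "Age", "Gender"]].loc[[1, 3]]
subset_loc = df.loc[[1, 3], ["User Name", "Age", "Gender"]]
same_result = subset_chained.equals(subset_loc)
```

For reading values, either form works. For writing values, the pandas documentation recommends a single .loc call, since assignment through chained indexing may act on a temporary copy.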
.loc[] method
As previously mentioned, the syntax for .loc is df.loc[row, column]. Need a reminder of the possible values for rows (index) and columns?
>>> df.index
RangeIndex(start=0, stop=4, step=1)
>>> df.columns
Index(['User Name', 'Country', 'City', 'Gender', 'Age'], dtype='object')
Let's try to get the country name for Harry Porter, who's on row 3.
>>> df.loc[2,'Country']
'UK'
To get the 2nd and the 4th row, and only the User Name, Gender and Age columns, we can pass the rows and columns as two lists into the "row" and "column" positional arguments.
>>> df.loc[[1,3],['User Name', 'Age', 'Gender']]
   User Name  Age Gender
1  Mary Jane   30      F
3  Jean Grey   30      F
Source: https://pythoninoffice.com/get-values-rows-and-columns-in-pandas-dataframe/