示例数据,假设我们有一个DataFrame对象,如下:
import pandas as pd df = pd.DataFrame({ "name": ["Alice", "Bob", "Charlie", "David"], "age": [25, 30, 35, 40], "gender": ["F", "M", "M", "M"] }) print(df) # 输出: # name age gender # 0 Alice 25 F # 1 Bob 30 M # 2 Charlie 35 M # 3 David 40 M
df = pd.read_excel("data.xlsx", index_col="date")
在读取文件时,我们可以指定索引,上面代码指定了"date"这一列为行索引
# 修改行索引值 df.index = ["a", "b", "c", "d"] print(df) # 输出: # name age gender # a Alice 25 F # b Bob 30 M # c Charlie 35 M # d David 40 M # 修改列索引值 df.columns = ["姓名", "年龄", "性别"] print(df) # 输出: # 姓名 年龄 性别 # a Alice 25 F # b Bob 30 M # c Charlie 35 M # d David 40 M
# 修改行索引值 df = df.rename({"a": "A", "b": "B", "c": "C", "d": "D"}, axis=0) print(df) # 输出: # 姓名 年龄 性别 # A Alice 25 F # B Bob 30 M # C Charlie 35 M # D David 40 M # 修改列索引值 df = df.rename({"姓名": "name", "年龄": "age", "性别": "gender"}, axis=1) print(df) # 输出: # name age gender # A Alice 25 F # B Bob 30 M # C Charlie 35 M # D David 40 M
# 将 name 列设置为行索引 df = df.set_index("name") print(df) # 输出: # age gender # name # Alice 25 F # Bob 30 M # Charlie 35 M # David 40 M # 将 name 和 gender 列设置为多级行索引 df = df.set_index(["name", "gender"]) print(df) # 输出: # age # name gender # Alice F 25 # Bob M 30 # Charlie M 35 # David M 40
pd.reset_index是用来把索引重置为默认的整数索引的方法。可以理解为重置列索引,并将pandas默认索引设置为索引
# 将行索引重置为默认的整数索引 df = df.reset_index() print(df) # 输出: # name age gender # 0 Alice 25 F # 1 Bob 30 M # 2 Charlie 35 M # 3 David 40 M