Conventions

    import pandas as pd
    from pandas import DataFrame
    import numpy as np
MultiIndex
A MultiIndex represents a multi-level (hierarchical) index. It inherits from Index, and each multi-level label is represented as a tuple.
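As a quick check of the two claims above, the following minimal sketch builds a small index and inspects it. The variable names (mi, is_index, first_label, n_levels) are illustrative and do not come from the original article.

```python
import pandas as pd

# Illustrative index; the name `mi` is an assumption made for this sketch.
mi = pd.MultiIndex.from_tuples([("A", "x1"), ("A", "x2"), ("B", "y1")])

is_index = isinstance(mi, pd.Index)  # MultiIndex is a subclass of Index
first_label = mi[0]                  # a single label comes back as a tuple
n_levels = mi.nlevels                # number of levels in the index

print(is_index, first_label, n_levels)
```

Each position in the index holds one tuple, with one element per level.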
I. Creating a MultiIndex Object
Method 1: a list of tuples
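
    # from_tuples with names= names both levels explicitly (replaces pd.Index(..., name=[...])).
    tuples = [("A", "x1"), ("A", "x2"), ("B", "y1"), ("B", "y2"), ("B", "y3")]
    m_index1 = pd.MultiIndex.from_tuples(tuples, names=["class1", "class2"])
    m_index1

Output: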

    MultiIndex(levels=[['A', 'B'], ['x1', 'x2', 'y1', 'y2', 'y3']],
               labels=[[0, 0, 1, 1, 1], [0, 1, 2, 3, 4]],
               names=['class1', 'class2'])


    # 5 rows x 3 columns of random integers in [1, 10), indexed by m_index1
    df1 = DataFrame(np.random.randint(1, 10, (5, 3)), index=m_index1)
    df1

Output:
| class1 | class2 | 0 | 1 | 2 |
|---|---|---|---|---|
| A | x1 | 7 | 4 | 8 |
|  | x2 | 4 | 5 | 2 |
| B | y1 | 6 | 9 | 7 |
|  | y2 | 2 | 1 | 6 |
|  | y3 | 6 | 8 | 6 |
Method 2: from a specific structure
For example, from_arrays() builds the index from one array per level:

    class1 = ["A", "A", "B", "B"]
    class2 = ["x1", "x2", "y1", "y2"]
    m_index2 = pd.MultiIndex.from_arrays([class1, class2], names=["class1", "class2"])
    m_index2

Output:

    MultiIndex(levels=[['A', 'B'], ['x1', 'x2', 'y1', 'y2']],
               labels=[[0, 0, 1, 1], [0, 1, 2, 3]],
               names=['class1', 'class2'])


    df2 = DataFrame(np.random.randint(1, 10, (4, 3)), index=m_index2)
    df2

Output:
| class1 | class2 | 0 | 1 | 2 |
|---|---|---|---|---|
| A | x1 | 2 | 4 | 5 |
|  | x2 | 3 | 5 | 9 |
| B | y1 | 7 | 1 | 2 |
|  | y2 | 3 | 1 | 8 |
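The tuple-based and array-based constructors describe the same data in two layouts: one tuple per row versus one array per level. A minimal sketch of that relationship follows; the names by_arrays, by_tuples and same are illustrative, not from the original article.

```python
import pandas as pd

class1 = ["A", "A", "B", "B"]
class2 = ["x1", "x2", "y1", "y2"]
names = ["class1", "class2"]

# One array per level.
by_arrays = pd.MultiIndex.from_arrays([class1, class2], names=names)

# One tuple per row; zip() transposes the two arrays into row tuples.
by_tuples = pd.MultiIndex.from_tuples(list(zip(class1, class2)), names=names)

same = by_arrays.equals(by_tuples)  # compares the labels position by position
print(same)
```

Which constructor to use is mostly a matter of how the source data is already shaped.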
Method 3: Cartesian product
from_product() creates a MultiIndex from the Cartesian product of several collections.

    m_index3 = pd.MultiIndex.from_product([["A", "B"], ["x1", "y1"]], names=["class1", "class2"])
    m_index3

Output:

    MultiIndex(levels=[['A', 'B'], ['x1', 'y1']],
               labels=[[0, 0, 1, 1], [0, 1, 0, 1]],
               names=['class1', 'class2'])


    # Here the MultiIndex is used for the columns rather than the rows
    df3 = DataFrame(np.random.randint(1, 10, (2, 4)), columns=m_index3)
    df3

Output:
| class1 | A |  | B |  |
|---|---|---|---|---|
| class2 | x1 | y1 | x1 | y1 |
| 0 | 2 | 9 | 1 | 8 |
| 1 | 5 | 2 | 5 | 2 |
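To see what "Cartesian product" means for from_product(), the sketch below compares its labels with those produced by itertools.product from the standard library. The names outer, inner, expected and labels_as_tuples are illustrative assumptions.

```python
import itertools
import pandas as pd

outer = ["A", "B"]
inner = ["x1", "y1"]

mi = pd.MultiIndex.from_product([outer, inner], names=["class1", "class2"])

# Every pairing of an outer label with an inner label, outer varying slowest.
expected = list(itertools.product(outer, inner))
labels_as_tuples = list(mi)

print(labels_as_tuples)
print(labels_as_tuples == expected)
```

The index has len(outer) * len(inner) entries, so this constructor suits fully crossed designs where every combination occurs.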
II. MultiIndex Attributes

    df1

Output:
| class1 | class2 | 0 | 1 | 2 |
|---|---|---|---|---|
| A | x1 | 7 | 4 | 8 |
|  | x2 | 4 | 5 | 2 |
| B | y1 | 6 | 9 | 7 |
|  | y2 | 2 | 1 | 6 |
|  | y3 | 6 | 8 | 6 |

    m_index4 = df1.index
    print(m_index4[0])  # first label of the index, returned as a tuple

Output:
('A', 'x1')
Call .get_loc() and .get_indexer() to get the integer positions of labels. get_indexer() returns -1 for a label that is not in the index:

    print(m_index4.get_loc(("A", "x2")))
    print(m_index4.get_indexer([("A", "x2"), ("B", "y1"), "nothing"]))

Output:
1
[ 1 2 -1]
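The lookup behaviour can be sketched on a standalone index with the same labels as df1. This is an illustration under assumptions rather than the article's own code: the variable names and the missing label ("C", "z9") are invented for the example, and the partial-key lookup relies on the index being sorted, as it is here.

```python
import pandas as pd

mi = pd.MultiIndex.from_tuples(
    [("A", "x1"), ("A", "x2"), ("B", "y1"), ("B", "y2"), ("B", "y3")],
    names=["class1", "class2"],
)

# A full key (one value per level) resolves to a single integer position.
full_key_pos = mi.get_loc(("A", "x2"))

# A partial key (first level only) resolves to a slice covering all matching rows.
partial_key_pos = mi.get_loc("B")

# get_indexer handles many labels at once; ("C", "z9") is a made-up missing label.
positions = mi.get_indexer([("A", "x2"), ("B", "y1"), ("C", "z9")])

print(full_key_pos, partial_key_pos, positions)
```

A practical difference between the two methods is that get_loc() raises KeyError for a missing label, while get_indexer() marks it with -1.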
A MultiIndex stores the labels of each level in a separate Index object, available through the levels attribute:

    print(m_index4.levels[0])
    print(m_index4.levels[1])

Output:

    Index(['A', 'B'], dtype='object', name='class1')
    Index(['x1', 'x2', 'y1', 'y2', 'y3'], dtype='object', name='class2')

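A MultiIndex also has a labels attribute, which stores, for every row, the integer position of its label within the corresponding level. In pandas 0.24 and later this attribute is named codes, and labels was removed in pandas 1.0:

    # On pandas 0.24 or later, use m_index4.codes instead of m_index4.labels
    print(m_index4.labels[0])
    print(m_index4.labels[1])

Output: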

    FrozenNDArray([0, 0, 1, 1, 1], dtype='int8')
    FrozenNDArray([0, 1, 2, 3, 4], dtype='int8')

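The relationship between the two attributes can be sketched as follows: each row's tuple is recovered by using the stored positions to look up labels in the levels. The sketch assumes a pandas version that exposes codes and falls back to labels otherwise; the names codes and rebuilt are illustrative.

```python
import pandas as pd

mi = pd.MultiIndex.from_tuples(
    [("A", "x1"), ("A", "x2"), ("B", "y1"), ("B", "y2"), ("B", "y3")],
    names=["class1", "class2"],
)

# Newer pandas calls the positions `codes`; the older version used in the article calls them `labels`.
codes = mi.codes if hasattr(mi, "codes") else mi.labels

# zip(*codes) yields one tuple of positions per row; look each position up in its level.
rebuilt = [
    tuple(level[int(pos)] for level, pos in zip(mi.levels, row))
    for row in zip(*codes)
]

print(rebuilt)
print(rebuilt == list(mi))
```

Storing each distinct label once in levels and referring to it by small integers keeps the index compact when labels repeat across many rows.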
Original article: https://blog.csdn.net/weixin_38168620/article/details/79580272