Reordering and Adding Columns in pandas (Python)

When working with Excel data, reordering columns and adding new ones are common tasks. Below we do both with pandas.

1. Reordering columns

    >>> import pandas as pd
    >>> df = pd.read_excel(r'D:/myExcel/1.xlsx')
    >>> df
            A   B   C   D
    0     bob  12  78  87
    1  millor  15  92  21
    >>> df.columns
    Index(['A', 'B', 'C', 'D'], dtype='object')
    # The simplest and most common approach: list the column names
    # in the order you want and let pandas select them from df
    >>> df[['A', 'D', 'C', 'B']]
            A   D   C   B
    0     bob  87  78  12
    1  millor  21  92  15
    # Repeating a column name also works
    >>> df[['A', 'A', 'A', 'A']]
            A       A       A       A
    0     bob     bob     bob     bob
    1  millor  millor  millor  millor
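Selecting with a list of names is not the only way to reorder. As a minimal sketch, the snippet below rebuilds the same table in memory (instead of reading the Excel file) and reorders it with `reindex(columns=...)`. `move_to_front` is a hypothetical helper written for this illustration, not a pandas function.

```python
import pandas as pd

# Rebuild the example table in memory instead of reading the Excel file.
df = pd.DataFrame({
    "A": ["bob", "millor"],
    "B": [12, 15],
    "C": [78, 92],
    "D": [87, 21],
})

# reindex(columns=...) returns a new DataFrame with the columns in the given order.
reordered = df.reindex(columns=["A", "D", "C", "B"])


def move_to_front(frame, name):
    """Hypothetical helper: return a new DataFrame with column `name` first."""
    rest = [c for c in frame.columns if c != name]
    return frame[[name] + rest]


d_first = move_to_front(df, "D")

print(list(reordered.columns))  # ['A', 'D', 'C', 'B']
print(list(d_first.columns))    # ['D', 'A', 'B', 'C']
print(list(df.columns))         # ['A', 'B', 'C', 'D'] (df itself is unchanged)
```

Both approaches return a new DataFrame, so assign the result back to `df` if you want to keep the new order.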

2. Adding one or more columns

(1) Direct assignment

    >>> df['E'] = [1, 2]
    >>> df
            A   B   C   D  E
    0     bob  12  78  87  1
    1  millor  15  92  21  2
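Direct assignment always appends the new column at the end, and it modifies `df` in place. If the position matters, `DataFrame.insert` may be a better fit. A small sketch on the same sample data (the column name `F` and its values are made up for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["bob", "millor"],
    "B": [12, 15],
    "C": [78, 92],
    "D": [87, 21],
})

# Direct assignment appends the column at the end, in place.
df["E"] = [1, 2]

# insert(loc, column, value) places the column at position `loc`, also in place.
# "F" and its values are invented for this example.
df.insert(1, "F", [10, 20])

print(list(df.columns))  # ['A', 'F', 'B', 'C', 'D', 'E']
```

In both cases the list must have one value per row, otherwise pandas raises a `ValueError`.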

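(2) Calling the assign method. It is well suited to deriving new columns from existing ones, either with basic arithmetic or by calling a function. Note that assign returns a new DataFrame and leaves df unchanged.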

    >>> df
            A   B   C   D
    0     bob  12  78  87
    1  millor  15  92  21
    # E is the new column's name; its values are column B minus column C
    >>> df.assign(E=df['B'] - df['C'])
            A   B   C   D   E
    0     bob  12  78  87 -66
    1  millor  15  92  21 -77
    # Adding two columns at once also works
    >>> df.assign(E=df['B'] - df['C'], F=df['B'] * df['C'])
            A   B   C   D   E     F
    0     bob  12  78  87 -66   936
    1  millor  15  92  21 -77  1380
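assign also accepts callables, which is the "calling a function" case mentioned above. The sketch below assumes pandas 0.23 or later on Python 3.6+, where a later keyword can refer to a column created earlier in the same call. The formula for `F` is arbitrary, chosen only to show the dependency.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["bob", "millor"],
    "B": [12, 15],
    "C": [78, 92],
    "D": [87, 21],
})

# Each callable receives the DataFrame as built so far, so F can use E.
# Assumes pandas >= 0.23 on Python 3.6+ (keyword order is preserved).
result = df.assign(
    E=lambda d: d["B"] - d["C"],
    F=lambda d: d["E"] * 2,
)

print(list(df.columns))      # ['A', 'B', 'C', 'D'] (the original is untouched)
print(result["E"].tolist())  # [-66, -77]
print(result["F"].tolist())  # [-132, -154]
```

Because assign returns a new DataFrame, write `df = df.assign(...)` when the new columns should stick.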

That covers how to reorder columns and add new ones with pandas.

Addendum: renaming DataFrame columns and reordering columns in pandas

Renaming columns:

Call the method directly:

    df.rename()

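Here is how the method is defined in the pandas source (taken from an older release; newer releases spell out the parameters explicitly instead of using *args and **kwargs):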

    def rename(self, *args, **kwargs):
        """
        Alter axes labels.

        Function / dict values must be unique (1-to-1). Labels not contained in
        a dict / Series will be left as-is. Extra labels listed don't throw an
        error.

        See the :ref:`user guide <basics.rename>` for more.

        Parameters
        ----------
        mapper, index, columns : dict-like or function, optional
            dict-like or functions transformations to apply to
            that axis' values. Use either ``mapper`` and ``axis`` to
            specify the axis to target with ``mapper``, or ``index`` and
            ``columns``.
        axis : int or str, optional
            Axis to target with ``mapper``. Can be either the axis name
            ('index', 'columns') or number (0, 1). The default is 'index'.
        copy : boolean, default True
            Also copy underlying data
        inplace : boolean, default False
            Whether to return a new DataFrame. If True then value of copy is
            ignored.
        level : int or level name, default None
            In case of a MultiIndex, only rename labels in the specified
            level.

        Returns
        -------
        renamed : DataFrame

        See Also
        --------
        pandas.DataFrame.rename_axis

        Examples
        --------
        ``DataFrame.rename`` supports two calling conventions

        * ``(index=index_mapper, columns=columns_mapper, ...)``
        * ``(mapper, axis={'index', 'columns'}, ...)``

        We *highly* recommend using keyword arguments to clarify your
        intent.

        >>> df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
        >>> df.rename(index=str, columns={"A": "a", "B": "c"})
           a  c
        0  1  4
        1  2  5
        2  3  6

        >>> df.rename(index=str, columns={"A": "a", "C": "c"})
           a  B
        0  1  4
        1  2  5
        2  3  6

        Using axis-style parameters

        >>> df.rename(str.lower, axis='columns')
           a  b
        0  1  4
        1  2  5
        2  3  6

        >>> df.rename({1: 2, 2: 4}, axis='index')
           A  B
        0  1  4
        2  2  5
        4  3  6
        """
        axes = validate_axis_style_args(self, args, kwargs, 'mapper', 'rename')
        kwargs.update(axes)
        # Pop these, since the values are in `kwargs` under different names
        kwargs.pop('axis', None)
        kwargs.pop('mapper', None)
        return super(DataFrame, self).rename(**kwargs)

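Note:

A single * (as in *args) collects any extra positional arguments into a tuple. At a call site, a single * does the reverse: it unpacks a list or tuple into separate positional arguments.

A double ** (as in **kwargs) collects any extra keyword arguments into a dict. At a call site, ** unpacks a dict into keyword arguments, so the value passed must be a dict (or another mapping).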
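To make the note concrete, here is a small sketch in plain Python. `show_args` is a hypothetical function that simply returns whatever it received, so both directions (collecting and unpacking) are visible.

```python
def show_args(*args, **kwargs):
    """Hypothetical function: return the collected positional and keyword arguments."""
    return args, kwargs


positional = [1, 2]
named = {"axis": "columns"}

# Extra positional arguments are collected into a tuple, keyword ones into a dict.
print(show_args(1, 2, axis="columns"))  # ((1, 2), {'axis': 'columns'})

# At the call site, * and ** unpack the list and the dict: same result as above.
print(show_args(*positional, **named))  # ((1, 2), {'axis': 'columns'})

# Without unpacking, the list and the dict arrive as two positional arguments.
print(show_args(positional, named))     # (([1, 2], {'axis': 'columns'}), {})
```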

Example:

    >>> import pandas as pd
    >>> a = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})
    >>> a
       A  B  C
    0  1  4  7
    1  2  5  8
    2  3  6  9
    # Rename column A to a, B to b, and C to c
    >>> a.rename(columns={'A': 'a', 'B': 'b', 'C': 'c'}, inplace=True)
    >>> a
       a  b  c
    0  1  4  7
    1  2  5  8
    2  3  6  9
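As the docstring above says, the mapper can also be a function, and labels in the mapping that do not exist are ignored by default. A short sketch without `inplace=True`, so the original DataFrame stays as it was (the label `Z` is deliberately one that does not exist):

```python
import pandas as pd

a = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]})

# A function is applied to every column label; a new DataFrame is returned.
lowered = a.rename(columns=str.lower)

# Only 'A' matches; the nonexistent label 'Z' is silently ignored by default.
partial = a.rename(columns={"A": "a", "Z": "z"})

print(list(a.columns))        # ['A', 'B', 'C']
print(list(lowered.columns))  # ['a', 'b', 'c']
print(list(partial.columns))  # ['a', 'B', 'C']
```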

Reordering columns:

For example:

    >>> import pandas
    >>> dict_a = {'user_id': ['webbang', 'webbang', 'webbang'],
    ...           'book_id': ['3713327', '4074636', '26873486'],
    ...           'rating': ['4', '4', '4'],
    ...           'mark_date': ['2017-03-07', '2017-03-07', '2017-03-07']}
    >>> df = pandas.DataFrame(dict_a)  # create a DataFrame from a dict
    # In the pandas version used here, the columns come out sorted alphabetically,
    # not in the dict's order ('user_id', 'book_id', 'rating', 'mark_date').
    # pandas 0.23+ on Python 3.6+ keeps the dict's insertion order instead.
    >>> df
        book_id   mark_date  rating  user_id
    0   3713327  2017-03-07       4  webbang
    1   4074636  2017-03-07       4  webbang
    2  26873486  2017-03-07       4  webbang

Select the columns in the order you want:

    >>> df = df[['user_id', 'book_id', 'rating', 'mark_date']]  # reorder to 'user_id', 'book_id', 'rating', 'mark_date'
    >>> df
       user_id   book_id  rating   mark_date
    0  webbang   3713327       4  2017-03-07
    1  webbang   4074636       4  2017-03-07
    2  webbang  26873486       4  2017-03-07
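The column order can also be fixed when the DataFrame is created, which avoids relying on how a particular pandas version orders dict keys. A minimal sketch using the `columns` argument of the constructor (the `wanted` list is just a name chosen for this example):

```python
import pandas as pd

dict_a = {
    "user_id": ["webbang", "webbang", "webbang"],
    "book_id": ["3713327", "4074636", "26873486"],
    "rating": ["4", "4", "4"],
    "mark_date": ["2017-03-07", "2017-03-07", "2017-03-07"],
}

# Passing columns= selects the dict keys in exactly this order.
wanted = ["user_id", "book_id", "rating", "mark_date"]
df = pd.DataFrame(dict_a, columns=wanted)
print(list(df.columns))  # ['user_id', 'book_id', 'rating', 'mark_date']

# If alphabetical order is what you want, sort the labels explicitly.
alphabetical = df[sorted(df.columns)]
print(list(alphabetical.columns))  # ['book_id', 'mark_date', 'rating', 'user_id']
```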