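http://blog.csdn.net/pipisorry/article/details/18012125

Index

Many of these methods or variants thereof are available on the objects that contain an index (Series/DataFrame), and those should most likely be used before calling these methods directly.

Finding the index position of an element (row) in a Series (if the index is a contiguous range starting at 0, this is simply the row number):

    nodes_id_index = pd.Index(nodes_series)
    print(nodes_id_index.get_loc('u_3223_4017'))

[Find element's index in pandas Series]
For more, see [Index].

Retrieval / selection

DataFrame column selection
As with a Series, a column of a DataFrame can be retrieved with dict-style notation or as an attribute; the result is a Series.
Note: the returned Series has the same index as the DataFrame, and its name attribute is set accordingly.

Selecting multiple DataFrame columns

    lines = lines[[0, 1, 4]]

or

    lines = lines[['user', 'check-in_time', 'location_id']]

DataFrame row selection

    >>> dates = pd.date_range('20130101', periods=6)
    >>> dates

Rows can be selected directly with [], but only with an integer-range or string-range slice:

    >>> df['2013-01-02':'2013-01-03']
    A B C D
    >>> df[3:5]

Selection by position: ix and iloc
Rows can also be retrieved by position or by label through indexers such as ix (more ix examples appear in the "Indexing, selection and filtering" section below).
Note: to extract a particular column you can use iloc or ix. This post originally called ix the more stable of the two, but .ix was deprecated in pandas 0.20 and removed in 1.0; current code should use .loc (labels) or .iloc (positions).

ix {row selection; row and column selection}

    In [45]: frame2.ix['three']
    year     2002
    state    Ohio
    pop       3.6
    debt      NaN
    Name: three

    df.ix[3]

Suppose we need the first 5 rows of the first column:

    df.ix[:,0].head()

    >>> df.ix[1:3, 0:3]  # equivalent to df.ix[1:3, ['A', 'B', 'C']]
                       A         B         C
    2013-01-02 -1.403797 -1.094992  0.304359
    2013-01-03  1.137673  0.636973 -0.746928

Select via the position of the passed integers
Unlike ix, [] and at, iloc[3] selects the row at integer position 3 (the fourth row), whereas the others, such as ix[3], select the row whose index label is 3.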
    In [32]: df.iloc[3]
    A    0.721555
    B   -0.706771
    C   -1.039575
    D    0.271860
    Name: 2013-01-04 00:00:00, dtype: float64

By integer slices, acting similar to numpy/python:

    In [33]: df.iloc[3:5,0:2]
                       A         B
    2013-01-04  0.721555 -0.706771
    2013-01-05 -0.424972  0.567020

By lists of integer position locations, similar to the numpy/python style:

    In [34]: df.iloc[[1,2,4],[0,2]]
                       A         C
    2013-01-02  1.212112  0.119209
    2013-01-03 -0.861849 -0.494929
    2013-01-05 -0.424972  0.276232

For getting fast access to a scalar (equiv to the prior method):

    In [38]: df.iat[1,1]
    Out[38]: -0.17321464905330858

For getting a cross section using a label:

    In [26]: df.loc[dates[0]]
    A    0.469112
    B   -0.282863
    C   -1.509059
    D   -1.135632
    Name: 2013-01-01 00:00:00, dtype: float64

Selecting on a multi-axis by label:

    In [27]: df.loc[:,['A','B']]
                       A         B
    2013-01-01  0.469112 -0.282863
    2013-01-02  1.212112 -0.173215
    2013-01-03 -0.861849 -2.104569
    2013-01-04  0.721555 -0.706771
    2013-01-05 -0.424972  0.567020
    2013-01-06 -0.673690  0.113648

Fastest way to select a single value: at[]
For getting fast access to a scalar (equiv to the prior method):

    In [31]: df.at[dates[0],'A']
    Out[31]: 0.46911229990718628

Boolean Indexing
Using a single column's values to select data:

    In [39]: df[df.A > 0]
                       A         B         C         D
    2013-01-01  0.469112 -0.282863 -1.509059 -1.135632
    2013-01-02  1.212112 -0.173215  0.119209 -1.044236
    2013-01-04  0.721555 -0.706771 -1.039575  0.271860

A where operation for getting:

    In [40]: df[df > 0]
                       A         B         C         D
    2013-01-01  0.469112       NaN       NaN       NaN
    ...
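The selection styles above can be compared side by side in a small sketch. The frame below is made-up data (a 6x4 block of consecutive numbers, not the random values shown in the session), chosen only so the results are easy to verify by eye.

```python
import numpy as np
import pandas as pd

# Hypothetical data: row i holds the values 4i, 4i+1, 4i+2, 4i+3.
dates = pd.date_range("2013-01-01", periods=6)
df = pd.DataFrame(np.arange(24.0).reshape(6, 4), index=dates, columns=list("ABCD"))

row_by_position = df.iloc[3]          # fourth row, selected by integer position
row_by_label = df.loc[dates[3]]       # the same row, selected by its label

block = df.iloc[3:5, 0:2]             # positional slices exclude the end point
label_block = df.loc["2013-01-02":"2013-01-03", ["A", "B"]]  # label slices include it

scalar_by_label = df.at[dates[0], "A"]   # fast scalar access by label
scalar_by_position = df.iat[1, 1]        # fast scalar access by position

large_c = df[df["C"] > 10]            # boolean indexing on one column
```

Both `block` and `label_block` have two rows, but for different reasons: the positional slice `3:5` stops before position 5, while the label slice keeps both of its end points.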
Filtering
Using the isin() method for filtering:

    In [41]: df2 = df.copy()
    In [42]: df2['E'] = ['one', 'one','two','three','four','three']
    In [43]: df2
                       A         B         C         D      E
    2013-01-01  0.469112 -0.282863 -1.509059 -1.135632    one
    2013-01-02  1.212112 -0.173215  0.119209 -1.044236    one
    2013-01-03 -0.861849 -2.104569 -0.494929  1.071804    two
    2013-01-04  0.721555 -0.706771 -1.039575  0.271860  three
    2013-01-05 -0.424972  0.567020  0.276232 -1.087401   four
    2013-01-06 -0.673690  0.113648 -1.478427  0.524988  three

    In [44]: df2[df2['E'].isin(['two','four'])]
    Out[44]:
                       A         B         C         D     E
    2013-01-03 -0.861849 -2.104569 -0.494929  1.071804   two
    2013-01-05 -0.424972  0.567020  0.276232 -1.087401  four

Indexing, selection and filtering
Most of the specific indexing rules are covered in the "Retrieval / selection" section above.

Series indexing and integer indexing
Series indexing (obj[...]) works much like NumPy array indexing, except that you can use the Series's index labels as well as plain integers.

    In [102]: obj = Series(np.arange(4.), index=['a', 'b', 'c', 'd'])
    In [103]: obj['b']
    Out[103]: 1.0
    In [104]: obj[1]
    Out[104]: 1.0
    In [105]: obj[2:4]
    Out[105]:
    c    2
    d    3
    In [106]: obj[['b', 'a', 'd']]
    Out[106]:
    b    1
    a    0
    d    3
    In [107]: obj[[1, 3]]
    b    1
    d    3
    In [108]: obj[obj < 2]
    a    0
    b    1

Integer indexing: the Series iget_value method and the DataFrame irow and icol methods
If you need reliable position-based indexing regardless of the index type, you can use the Series iget_value method and the DataFrame irow and icol methods. These methods were removed from later pandas releases; .iloc covers the same use.

Label slicing
Slicing with labels differs from normal Python slicing in that the end point is included:

    In [109]: obj['b':'c']
    b    1
    c    2

Assignment through indexing
The same forms can be used for assignment:

    In [110]: obj['b':'c'] = 5
    In [111]: obj
    a    0
    b    5
    c    5
    d    3

Rows can be selected by slicing or with a boolean array, which is intended to make DataFrame syntax look more like an ndarray in this case.

    In [116]: data[:2]
              one  two  three  four
    Ohio        0    1      2     3
    Colorado    4    5      6     7

    In [117]: data[data['three'] > 5]
              one  two  three  four
    Colorado    4    5      6     7
    Utah        8    9     10    11
    New York   12   13     14    15

DataFrame label indexing on rows: ix
A DataFrame supports label indexing on its rows through ix, which lets you select a subset of rows and columns using NumPy-like notation plus axis labels. It is also a less verbose way to reindex.

So there are many ways to select and rearrange the data contained in a pandas object. There is a short summary of the DataFrame options, and hierarchical indexing adds some extra ones.
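A minimal sketch of the slicing rules discussed above, written with the explicit .loc and .iloc indexers rather than plain [] or ix. The `data` frame is reconstructed from the printed output (a 4x4 block of consecutive integers), which is an assumption about how it was originally built.

```python
import numpy as np
import pandas as pd

obj = pd.Series(np.arange(4.0), index=["a", "b", "c", "d"])

n_label = len(obj.loc["b":"c"])     # label slice: end point "c" is included
n_position = len(obj.iloc[1:2])     # positional slice: end point is excluded

obj.loc["b":"c"] = 5                # assignment through a label slice

# Assumed construction of the `data` frame shown in the session output.
data = pd.DataFrame(
    np.arange(16).reshape(4, 4),
    index=["Ohio", "Colorado", "Utah", "New York"],
    columns=["one", "two", "three", "four"],
)

first_two = data[:2]                      # row slice by position
big_three = data[data["three"] > 5]       # rows chosen by a boolean array
subset = data.loc[["Colorado", "Utah"], ["two", "three"]]  # rows and columns by label
```

Using .loc and .iloc makes it explicit whether a slice is label-based or position-based, which plain [] leaves to inference from the index type.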
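Note: When designing pandas, I felt that having to type frame[:, col] to select a column was too verbose (and error-prone), since column selection is one of the most common operations. I therefore made the design trade-off of pushing all of the rich label indexing into ix.
[Different Choices for Indexing]

Unique values, value counts and membership
Methods for unique values, value counts and membership:

isin computes vectorized set membership and can be used to select a subset of the data in a Series or in a DataFrame column.

    >>> obj = Series(['c','a','d','a','a','b','b','c','c'])
    obj.unique()  # unique returns an array of the unique values in the Series

value_counts computes how often each value occurs in a Series.
Looking at the source, the count is implemented with a hash table:

    keys, counts = htable.value_count_scalar64(values, dropna)

Counting occurrences of every element of an array or sequence: pd.value_counts
value_counts is also a top-level pandas function that can be used on any array or sequence. It returns a pandas Series, which you can mostly treat like a dict.
You can also skip some of the checks and call directly the hash-table routine that pandas.value_counts() uses (as I saw it in the source). pandas.hashtable is an internal module whose location and function names have changed between releases, so this only works on the old version this post was written against:

    import numpy as np
    import pandas.hashtable as htable

    values = np.array([1, 2, 3, 5, 1, 3, 3, 2, 3, 5])
    values_cnts = dict(zip(*htable.value_count_scalar64(values, dropna=True)))
    print(values_cnts)

Applying to a DataFrame with apply
Sometimes you may want a histogram over several related columns of a DataFrame, for example columns named:

    Qu1 Qu2 Qu3

Pass pandas.value_counts to the DataFrame's apply function.

Index objects: obj.index
pandas Index objects hold the axis labels and other metadata (such as the axis name or names). Any array or other sequence of labels used when constructing a Series or DataFrame is converted internally to an Index:

    In [68]: obj = Series(range(3), index=['a', 'b', 'c'])
    In [69]: index = obj.index
    In [70]: index
    Out[70]: Index([a, b, c], dtype=object)
    In [71]: index[1:]
    Out[71]: Index([b, c], dtype=object)

Immutability
Index objects are immutable and therefore cannot be modified by the user:

    In [72]: index[1] = 'd'
    Exception Traceback (most recent call last)...
    Exception: <class 'pandas.core.index.Index'> object is immutable

Immutability matters because it lets Index objects be shared safely among data structures:

    In [73]: index = pd.Index(np.arange(3))
    In [74]: obj2 = Series([1.5, -2.5, 0], index=index)
    In [75]: obj2.index is index
    Out[75]: True

Main Index objects in pandas
The table lists the Index classes built into the library. With some development effort, Index can be subclassed to implement specialized axis-indexing functionality. Most users do not need to know much about Index objects, but they are still an important part of the pandas data model.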
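The unique, value_counts and isin methods described above can be tried with only public APIs, as in this sketch. The Qu1/Qu2/Qu3 values are invented for illustration, since the post does not show that frame's contents, and a lambda is used in place of passing pandas.value_counts directly.

```python
import pandas as pd

obj = pd.Series(["c", "a", "d", "a", "a", "b", "b", "c", "c"])

uniques = obj.unique()               # unique values, in order of first appearance
counts = obj.value_counts()          # Series mapping value -> number of occurrences
subset = obj[obj.isin(["b", "c"])]   # keep only the elements equal to "b" or "c"

# Hypothetical survey-style data; the values are made up for this example.
data = pd.DataFrame({
    "Qu1": [1, 3, 4, 3, 4],
    "Qu2": [2, 3, 1, 2, 3],
    "Qu3": [1, 5, 2, 4, 4],
})

# Count values column by column; values absent from a column become 0.
histogram = data.apply(lambda col: col.value_counts()).fillna(0)
```

The rows of `histogram` are the distinct values found anywhere in the frame, and each cell says how often that value occurs in that column.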
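Fixed-size set behavior
In addition to being array-like, an Index also behaves like a fixed-size set:

    In [76]: frame3
    state  Nevada  Ohio
    year
    2000      NaN   1.5
    2001      2.4   1.7
    2002      2.9   3.6

    In [77]: 'Ohio' in frame3.columns
    Out[77]: True
    In [78]: 2003 in frame3.index
    Out[78]: False

Index methods and properties
Every Index has a number of methods and properties for set logic, which answer common questions about the data it contains.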
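A short sketch of the set-like behavior and immutability mentioned above. The labels are arbitrary, and the error type caught here (TypeError) is what current pandas releases raise; the old session above shows a plain Exception.

```python
import pandas as pd

index = pd.Index(["a", "b", "c"])
other = pd.Index(["c", "d"])

has_b = "b" in index                   # membership test, like a set
union = index.union(other)             # labels present in either index
common = index.intersection(other)     # labels present in both
only_left = index.difference(other)    # labels in `index` but not in `other`

# Index objects cannot be changed in place.
try:
    index[1] = "z"
    is_immutable = False
except TypeError:
    is_immutable = True
```

Because these operations return new Index objects, the original can safely be shared between several Series or DataFrames.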
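Reindexing: reindex
A key method on pandas objects is reindex, which creates a new object with the data conformed to a new index. reindex does not rename labels; it rearranges the data to match the new index, and any label that was not in the original index gets a missing value (NaN). It does not modify the original object, so assign the result if you want to keep it.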
    In [79]: obj = Series([4.5, 7.2, -5.3, 3.6], index=['d', 'b', 'a', 'c'])
    In [80]: obj
    d    4.5
    b    7.2
    a   -5.3
    c    3.6

Rearranging data with reindex (row index)
Calling reindex on a Series rearranges the data to match the new index, introducing missing values for any labels that were not already present:

    In [81]: obj2 = obj.reindex(['a', 'b', 'c', 'd', 'e'])
In [82]: obj2
a -5.3
b 7.2
c 3.6
d 4.5
e NaN
In [83]: obj.reindex(['a', 'b', 'c', 'd', 'e'], fill_value=0)
a -5.3
b 7.2
c 3.6
d 4.5
e 0.0
Interpolating or filling while reindexing: method
For ordered data such as time series, you may want to interpolate or fill values when reindexing. The method option lets you do this; a method such as ffill fills values forward:

    In [84]: obj3 = Series(['blue', 'purple', 'yellow'], index=[0, 2, 4])
    In [85]: obj3.reindex(range(6), method='ffill')
    0      blue
    1      blue
    2    purple
    3    purple
    4    yellow
    5    yellow

method options: ffill or pad fills (carries) values forward; bfill or backfill fills values backward.
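The reindex behavior described above, collected into one runnable sketch that uses the same values as the sessions.

```python
import pandas as pd

obj = pd.Series([4.5, 7.2, -5.3, 3.6], index=["d", "b", "a", "c"])

# reindex returns a new object; the label "e" did not exist, so it becomes NaN.
obj2 = obj.reindex(["a", "b", "c", "d", "e"])

# fill_value replaces the NaN that would otherwise be introduced.
obj_filled = obj.reindex(["a", "b", "c", "d", "e"], fill_value=0)

# Forward fill: each new label takes the value of the closest earlier label.
obj3 = pd.Series(["blue", "purple", "yellow"], index=[0, 2, 4])
forward_filled = obj3.reindex(range(6), method="ffill")
```

Note that `obj` itself keeps its original order and labels. Filling with `method` needs an index that is sorted (monotonic), which is why it suits time-series-like data.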
With a DataFrame, reindex can alter the (row) index, the columns, or both. When passed just a sequence, the rows are reindexed in the result:

    In [86]: frame = DataFrame(np.arange(9).reshape((3, 3)), index=['a', 'c', 'd'], columns=['Ohio', 'Texas', 'California'])
    In [87]: frame
       Ohio  Texas  California
    a     0      1           2
    c     3      4           5
    d     6      7           8

Reindexing columns: the columns keyword
The columns can be reindexed with the columns keyword:

    In [90]: states = ['Texas', 'Utah', 'California']
    In [91]: frame.reindex(columns=states)
       Texas  Utah  California
    a      1   NaN           2
    c      4   NaN           5
    d      7   NaN           8

Renaming DataFrame columns, method 2:

    df.rename(columns={'age': 'x', 'fat_percent': 'y'})

Reindexing rows and columns together (two ways)
Both can be reindexed in one shot, although interpolation is applied only row-wise (axis 0). In recent pandas versions this call may instead raise an error, because method is applied to the columns as well and they are not sorted; the output below is from the old version:

    In [92]: frame.reindex(index=['a', 'b', 'c', 'd'], method='ffill', columns=states)
       Texas  Utah  California
    a      1   NaN           2
    b      1   NaN           2
    c      4   NaN           5
    d      7   NaN           8

As you will see, label indexing with ix makes reindexing more succinct:

    In [93]: frame.ix[['a', 'b', 'c', 'd'], states]
       Texas  Utah  California
    a      1   NaN           2
    b    NaN   NaN         NaN
    c      4   NaN           5
    d      7   NaN           8

Converting between index and columns: set_index and reset_index
It is common to want to use one or more columns of a DataFrame as the row index, or to move the row index into the DataFrame's columns. Take this DataFrame as an example:

    frame = pd.DataFrame({'a': range(7), 'b': range(7, 0, -1), 'c': ['one','one','one','two','two','two','two'], 'd': [0, 1, 2, 0, 1, 2, 3]})
    frame
       a  b    c  d
    0  0  7  one  0
    1  1  6  one  1
    2  2  5  one  2
    3  3  4  two  0
    4  4  3  two  1
    5  5  2  two  2
    6  6  1  two  3

Columns to row index: set_index
The DataFrame set_index function creates a new DataFrame using one or more of its columns as the row index:

    frame2 = frame.set_index(['c', 'd'])
    In [6]: frame2
           a  b
    c   d
    one 0  0  7
        1  1  6
        2  2  5
    two 0  3  4
        1  4  3
        2  5  2
        3  6  1

By default the columns are removed from the DataFrame, but they can be kept:

    frame.set_index(['c','d'], drop=False)
           a  b    c  d
    c   d
    one 0  0  7  one  0
        1  1  6  one  1
        2  2  5  one  2
    two 0  3  4  two  0
        1  4  3  two  1
        2  5  2  two  2
        3  6  1  two  3

[For grouping without reduce, see the group section]

Index levels to columns: reset_index
reset_index does the opposite of set_index: the levels of the hierarchical index are moved into the columns:

    frame2.reset_index()
         c  d  a  b
    0  one  0  0  7
    1  one  1  1  6
    2  one  2  2  5
    3  two  0  3  4
    4  two  1  4  3
    5  two  2  5  2
    6  two  3  6  1

[MultiIndex / Advanced Indexing]
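The DataFrame reindexing and the set_index / reset_index round trip can be checked with the sketch below. It uses reindex on both axes instead of the ix call, since ix is no longer available in current pandas; the variable names `table`, `both_axes`, `kept` and `restored` are illustrative.

```python
import numpy as np
import pandas as pd

table = pd.DataFrame(
    np.arange(9).reshape((3, 3)),
    index=["a", "c", "d"],
    columns=["Ohio", "Texas", "California"],
)
states = ["Texas", "Utah", "California"]

# Reindex rows and columns together; unknown labels ("b", "Utah") become NaN.
both_axes = table.reindex(index=["a", "b", "c", "d"], columns=states)

frame = pd.DataFrame({
    "a": range(7),
    "b": range(7, 0, -1),
    "c": ["one", "one", "one", "two", "two", "two", "two"],
    "d": [0, 1, 2, 0, 1, 2, 3],
})

frame2 = frame.set_index(["c", "d"])            # columns c and d become index levels
kept = frame.set_index(["c", "d"], drop=False)  # same index, columns kept as well
restored = frame2.reset_index()                 # index levels moved back to columns
```

After `reset_index` the former index levels come first, so the column order of `restored` is c, d, a, b rather than the original a, b, c, d.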
Explicit copies
The column returned when indexing a DataFrame is a view on the underlying data, not a copy. Any in-place modification to that Series is therefore reflected in the DataFrame. A column can be copied explicitly with the Series copy method.
Note: While standard Python / NumPy expressions for selecting and setting are intuitive and come in handy for interactive work, for production code we recommend the optimized pandas data access methods: .at, .iat, .loc, .iloc and .ix.
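A minimal sketch of the explicit copy mentioned above, with made-up data. Whether an uncopied column behaves as a view depends on the pandas version and its copy-on-write setting, so the example only demonstrates the part that holds everywhere: a copy is independent of the frame.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Take an explicit copy of the column before modifying it.
col_copy = df["a"].copy()
col_copy.iloc[0] = 100

original_value = df.loc[0, "a"]   # the DataFrame still holds the old value
copied_value = col_copy.iloc[0]   # only the copy was changed
```

Calling copy() is the simplest way to make the intent unambiguous when a column is going to be modified separately from its parent frame.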
SettingWithCopyWarning
The message is: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. It was triggered by:

    df[len(df.columns) - 1][df[len(df.columns) - 1] > 0.0] = 1.0

The warning is caused mainly by the second indexing step, which may be operating on a copy.
Oddly, df really was modified, while the warning seems to say the change went to a copy of df. So this is only a warning: depending on how the data is laid out in memory, the assignment may or may not reach the original.
Also, print(df[len(df.columns) - 1][df[len(df.columns) - 1] > 0.0].is_copy) prints None. Why None rather than True or False?

Solution
When modifying the original data of df, use loc, paying attention to the order of the row and column indexers. The warning itself suggests: Try using .loc[row_indexer,col_indexer] = value instead

    df.loc[df[len(df.columns) - 1] > 0.0, len(df.columns) - 1] = 1.0

Silencing the warning is not recommended:

    pd.options.mode.chained_assignment = None  # default='warn'

See also the "Why .ix is a bad idea" part below.
[Official explanation of this warning: Returning a view versus a copy]
[Pandas SettingWithCopyWarning]
[How to deal with SettingWithCopyWarning in Pandas?]

Why .ix is a bad idea
Data selected through .ix can be a copy, in which case modifying the selection does not modify the original data, whereas .loc modifies the original data.
The .ix object tries to do more than one thing, and for anyone who has read anything about clean code, this is a strong smell.
Given this dataframe:

    df = pd.DataFrame({"a": [1,2,3,4], "b": [1,1,2,2]})

Two behaviors:

    dfcopy = df.ix[:,["a"]]
    dfcopy.a.ix[0] = 2

Behavior one: dfcopy is now a stand-alone dataframe. Changing it will not change df.

    df.ix[0, "a"] = 3

Behavior two: this changes the original dataframe.

Use .loc instead
The pandas developers recognized that the .ix object was quite smelly [speculatively] and thus created two new objects which help in the accession and assignment of data.
.loc is faster, because it does not try to create a copy of the data.
.loc is meant to modify your existing dataframe in place, which is more memory efficient.
.loc is predictable; it has one behavior.
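The recommended fix can be sketched as follows: select rows and column in a single .loc call instead of chaining two [] operations. The frame is hypothetical, with integer column labels as in the post, and `df.columns[-1]` is used to pick the last column instead of `len(df.columns) - 1`, which only works when the labels happen to be 0..n-1.

```python
import pandas as pd

# Hypothetical frame with integer column labels.
df = pd.DataFrame({0: [-1.0, 0.5, 2.0], 1: [0.3, -0.2, 0.7]})

last_col = df.columns[-1]

# One indexing operation: row mask and column label together.
df.loc[df[last_col] > 0.0, last_col] = 1.0

result = df[last_col].tolist()
```

Because the assignment goes through one indexer on `df` itself, there is no intermediate object that might be a copy, and no warning is raised.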
[Returning a view versus a copy]

Axis indexes with duplicate values
A Series with duplicate index values:

    >>> obj = Series(range(5), index=['a','a','b','b','c'])

The index's is_unique property tells you whether its values are unique.

Selecting data with a duplicated index
If a label maps to several values, a Series is returned; if it maps to a single value, a scalar is returned.

    >>> obj['a']
    a    0
    a    1
    >>> obj['c']
    4

The same applies when indexing the rows of a DataFrame:

    >>> df = DataFrame(np.random.randn(4, 3), index=['a','a','b','b'])
    >>> df
    >>> df.ix['b']

Hierarchical indexing
Hierarchical indexing is an important pandas feature that lets you have multiple (two or more) index levels on one axis. Put abstractly, it lets you work with higher-dimensional data in a lower-dimensional form.

Series
Create a Series with a list of lists or arrays as the index:

    data = pd.Series(np.random.randn(10), index=[['a','a','a','b','b','b','c','c','d','d'], [1, 2, 3, 1, 2, 3, 1, 2, 2, 3]])
    In [6]: data
    a  1    0.382928
       2   -0.360273
       3   -0.533257
    b  1    0.341118
       2    0.439390
       3    0.645848
    c  1    0.006016
       2    0.700268
    d  2    0.405497
       3    0.188755
    dtype: float64

This is the formatted output of a Series with a MultiIndex. The "gaps" in the index display mean "use the label directly above".

    >>> data.index
    MultiIndex(levels=[[u'a', u'b', u'c', u'd'], [1, 2, 3]], labels=[[0, 0, 0, 1, 1, 1, 2, 2, 3, 3], [0, 1, 2, 0, 1, 2, 0, 1, 1, 2]])

Selecting subsets from a hierarchically indexed object

    In [8]: data['b':'c']
    b  1    0.341118
       2    0.439390
       3    0.645848
    c  1    0.006016
       2    0.700268
    dtype: float64

    In [10]: data.ix[['b', 'd']]
    b  1    0.341118
       2    0.439390
       3    0.645848
    d  2    0.405497
       3    0.188755
    dtype: float64

Selecting within the "inner" level:

    In [11]: data[:, 2]
    a   -0.360273
    b    0.439390
    c    0.700268
    d    0.405497
    dtype: float64

Hierarchical indexing plays an important role in reshaping data and in group-based operations such as stacking and unstacking (for example, building pivot tables). The data can be rearranged into a DataFrame with its unstack method:

    In [12]: data.unstack()
              1         2         3
    a  0.382928 -0.360273 -0.533257
    b  0.341118  0.439390  0.645848
    c  0.006016  0.700268       NaN
    d       NaN  0.405497  0.188755

    # the inverse of unstack is stack: data.unstack().stack()

DataFrame
With a DataFrame, either axis can have a hierarchical index:

    frame = pd.DataFrame(np.arange(12).reshape((4, 3)), index=[['a','a','b','b'], [1, 2, 1, 2]], columns=[['Ohio','Ohio','Colorado'], ['Green','Red','Green']])
    In [16]: frame
         Ohio     Colorado
        Green Red    Green
    a 1     0   1        2
      2     3   4        5
    b 1     6   7        8
      2     9  10       11

Each level can have a name (a string or any other Python object). If names are given, they show up in the console output (do not confuse index names with axis labels!):

    In [18]: frame.index.names = ['key1','key2']
    In [19]: frame.columns.names = ['state', 'color']
    In [20]: frame
    state      Ohio     Colorado
    color     Green Red    Green
    key1 key2
    a    1        0   1        2
         2        3   4        5
    b    1        6   7        8
         2        9  10       11

Partial column indexing: selecting groups of columns

    In [21]: frame['Ohio']
    color      Green  Red
    key1 key2
    a    1         0    1
         2         3    4
    b    1         6    7
         2         9   10

A MultiIndex can be created on its own and reused:

    pd.MultiIndex.from_arrays([['Ohio', 'Ohio', 'Colorado'], ['Green','Red', 'Green']], names=['state', 'color'])

Reordering and sorting levels: swaplevel and sortlevel
Sometimes you need to rearrange the order of the levels on an axis, or sort the data by the values in one specific level.

Changing the order of levels on an axis: swaplevel
swaplevel takes two level numbers or names and returns a new object with the levels interchanged (the data is otherwise unchanged):

    In [24]: frame
    state      Ohio     Colorado
    color     Green Red    Green
    key1 key2
    a    1        0   1        2
         2        3   4        5
    b    1        6   7        8
         2        9  10       11

    In [25]: frame.swaplevel('key1','key2')
    state      Ohio     Colorado
    color     Green Red    Green
    key2 key1
    1    a        0   1        2
    2    a        3   4        5
    1    b        6   7        8
    2    b        9  10       11

Note: is this the same as frame.swaplevel(0,1)?

Sorting data by the values in one level: sortlevel
sortlevel sorts the data (stably) using only the values in a single level. When swapping levels it is common to use sortlevel as well, so that the result is sorted. In later pandas releases sortlevel was replaced by sort_index(level=...).

    In [26]: frame.sortlevel(1)
    state      Ohio     Colorado
    color     Green Red    Green
    key1 key2
    a    1        0   1        2
    b    1        6   7        8
    a    2        3   4        5
    b    2        9  10       11

    In [27]: frame.swaplevel(0,1).sortlevel(0)
    state      Ohio     Colorado
    color     Green Red    Green
    key2 key1
    1    a        0   1        2
         b        6   7        8
    2    a        3   4        5
         b        9  10       11

Note: on a hierarchically indexed object, data selection performs much better if the index is sorted lexicographically starting from the outermost level (that is, the result of calling sortlevel(0) or sort_index()).

Summary statistics by level
Many descriptive and summary statistics on DataFrame and Series have a level option that specifies the level to aggregate by on a particular axis, so you can sum by a level of the rows or of the columns:

    In [29]: frame
    state      Ohio     Colorado
    color     Green Red    Green
    key1 key2
    a    1        0   1        2
         2        3   4        5
    b    1        6   7        8
         2        9  10       11

    In [30]: frame.sum(level='key2')
    state  Ohio     Colorado
    color Green Red    Green
    key2
    1         6   8       10
    2        12  14       16

    In [33]: frame.sum(level='color', axis=1)
    color      Green  Red
    key1 key2
    a    1         2    1
         2         8    4
    b    1        14    7
         2        20   10

    In [35]: frame.sum(level='color')
    ...
    AssertionError: Level color not in index

[MultiIndex / Advanced Indexing]
from: http://blog.csdn.net/pipisorry/article/details/18012125
ref: [Indexing and Selecting Data]
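For readers on a current pandas release, the hierarchical-index operations from this post can be reproduced with the sketch below. It swaps in sort_index(level=...) for sortlevel and groupby(level=...).sum() for sum(level=...), because the older spellings are not available in recent versions; the results match the outputs shown above.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(
    np.arange(12).reshape((4, 3)),
    index=pd.MultiIndex.from_arrays(
        [["a", "a", "b", "b"], [1, 2, 1, 2]], names=["key1", "key2"]
    ),
    columns=pd.MultiIndex.from_arrays(
        [["Ohio", "Ohio", "Colorado"], ["Green", "Red", "Green"]],
        names=["state", "color"],
    ),
)

ohio = frame["Ohio"]   # partial indexing on the outer column level

# Swap the two row levels, then sort by the new outer level.
swapped = frame.swaplevel("key1", "key2").sort_index(level=0)

# Sum rows that share the same key2 value.
by_key2 = frame.groupby(level="key2").sum()

# Sum columns that share the same color: transpose, group the rows, transpose back.
by_color = frame.T.groupby(level="color").sum().T
```

Grouping the transposed frame is one way to aggregate over a column level without relying on an axis argument; asking for the level "color" on the row axis fails, just as in the error above, because that level only exists on the columns.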