reversed("andelf")== "fledna": 2009

2009年7月31日星期五

这是逻辑? Three Idols

好玩的题, 题目是这样的:

这里我们有三个 God, A, B, C. 对应着 True-God, False-God, Random-God, 当然, 你不知道具体对应关系如何. True-God 的回答总是正确的, False-God 的回答总是错误的, 而 Random-God 的回答是随机的. God 使用单词 ja, da 代表 Yes 和 No, 当然, 你也不知道具体哪个是 Yes, 哪个是 No. 你有三次机会, 每次只能问其中一个 God 问题, 问如何判断 A B C 的对应关系. 

在看了提示后想到如下解决方法:

题目对于 Random-God 的行为表述不是很清楚, 比如回答前是否需要知道答案. 这里我们假设 Random-God 在确立随机性前需要知道答案是什么.

我们使用一道邪恶的题目, "你对本题的回答是永远一样的么?", 这样 False-God 和 Random-God 遇到这题就傻B了, 而 True-God 的答案(无论是 ja, da) 对应着 Yes.

然而, 虽然这个问题很邪恶, 但是无法判断连着两次提问遇到傻 B 的情况. 判断树深度大于 3, so, 解决方法不行.

继续看提示, 发现可以这么问, 比如问 A, "如果我问'B 是 Random-God 么?', 你会回答 `ja` 对么?", 这么一来, 如果回答的是 `ja`, 那么要么 A 是 Random-God, 给出了随机答案; 要么 B 是 Random-God(这两种情况下, C 都不会是 Random-God).  如果回答的是 `da`, 则要么 A 是 Random-God; 要么 B 不是 Random-God(这两种情况下, B都不会是 Random-God).

这样一来, 我们起码知道了一个不是 Random-God, 假设它为 C, 问它 "如果我问'你是 True-God 么?', 你会回答 `ja` 对么?", 那么他回答 `ja` 就代表它是..... 如果他回答 `da` 就代表它是......

继续问这个 C, "如果我问'B 是 Random-God 么?', 你会回答 `ja` 对么?",..... Orz....

很绕啊~~~

"如果我问'你明白了么?', 你会回答 `ja` 对么?"

2009年7月29日星期三

解决 End of file during parsing: .emacs

其实...是括号不匹配. 解决方法 M-x check-parens 找到不匹配的括号.

2009年7月15日星期三

Compile haskell package on win32

Many haskell packages depends on some libs, for e.g. libssl, libcurl, ... so, when that goes to win32, things become complicated... I find an easy way to do it: Install Dev-cpp and use its devpackage system. There are lots of packages for c/c++, so when compile haskell packages: simply pass --extra-include-dirs= and --extra-lib-dirs= to Setup.[l]hs

2009年7月14日星期二

质数/素数 Haskell 求解

在 ProjectEuler 经常遇到质数求解的问题, 需要一个够强大的算法. 尝试自己写了个, 用了 Memoization:

primeList :: [Integer]
primeList
= 2:3:5:[p | p <- [7,9..]
      , all ((0 /=) . mod p) $ takeWhile (<= ceiling $ sqrt $ fromInteger p) primeList ]

在 Hugs 下测试:

Main> primeList !! 1000
7927 :: Integer
(3341273 reductions, 4756123 cells, 5 garbage collections)

貌似不错, 但是很明显, 代码适用范围有限, 例如 primeList !! 10000 在半天后终于出结果. ssword 提供了一个更短的算法, 目测貌似没我的效率高:

primes :: [Integer]
primes = sieve [2..]
    where sieve (p : xs) = p : sieve [x | x <- xs, x `mod` p > 0]

代码相当巧妙, 很不错的筛选法, 但是效率实测... 估计是时间花在 [] 上和不断的回朔上, Lazy 嘛~ 发现用到下个值只好回朔到 2, 3, 5, 7, 11, 13....

Main> primes !! 1000
7927 :: Integer
(35111063 reductions, 51142767 cells, 59 garbage collections)

以上算法都只是处理小质数数列, 遇到大质数.... 全部死掉. PS, stack overflow. 熟悉的错误. 我找到了这个 http://en.wikipedia.org/wiki/Miller-Rabin_test, 感觉不错, 下面的 Haskell 算法是自己写的, 很丑陋.... 根据 Python 代码改的. Miller Rabin 筛选法.

-- Miller-Rabin methold By Andelf
-- true for p < 10^16 except `46 856 248 255 981’, so you can add 13
-- optimize: order of and, or, any, all/ all judge to odd/even
isPrime :: Integer -> Bool
isPrime p
    | p == 2          = True 
    | even p || p < 2 = False
    | otherwise    
        = all (isPrimeTest p) (takeWhile (< p) [2, 3, 7, 61, 24251])
    where 
      isPrimeTest :: Integer -> Integer -> Bool   
      isPrimeTest n a 
          = odd d || t == n - 1 
          where 
            (t, d) = loop (pow a (n-1) n) (shiftTillOdd (n - 1))
            loop t d = if t /= 1 && d /= n - 1 && t /= n - 1 
                       then loop (t * t `mod` n) (d + d) 
                       else (t, d)
            shiftTillOdd = until odd $ flip div 2
            pow :: Integer -> Integer -> Integer -> Integer 
            pow a d n
                | even d            = pow (a * a `mod` n) (d `div` 2) n `mod` n 
                | d == 1 || a == 1  = a
                | otherwise         = pow (a * a `mod` n) (d `div` 2) n * a `mod` n

本来想用 AKS 的, 后来发现在多项式判断那块时间太长, 所以还是采用了伪素数算法, 你所看到的第 2 行注释, p < 10^16 except `46 856 248 255 981’ 是没问题的. 代码经过简单优化, guard 判断的分支顺序可能很诡异, mod, even, odd, 等函数的使用可能也很诡异, 但请相信这个是在 Hugs 下测试过的, 目前是最优. div/(+) 就是比 Data.Bits 快很多. 很诡异. ghc 同学请自己权衡. 再次动手可能在 pow 函数上. [2, 3, 7, 61, 24251] 的选择请参考链接.

Main> head $ filter isPrime [10 ^ 100..]
100000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000267 :: Integer
(8965148 reductions, 25399307 cells, 29 garbage collections)

参考链接

不是最优, 仅供参考.

2009年6月21日星期日

ZMI中对象简单介绍

ZMI中对象简单介绍
对象名	说明
portal_porperties	站点属性设置。site_properties:站点属性，navtree_properties：导航属性
portal_action	站点操作项
portal_controlpanel	站点控制面板操作项
portal_actionicons	站点操作项图标
portal_type	内容类型管理
portal_factory	内容创建控制
content_type_registry	内容和文件映射
portaL_registry	用户注册
portal_membership	用户管理
acl_users	用户源
portal_memberdata	用户属性
portal_skin	站点皮肤管理
portal_setup	站点设置管理
portal_catalog	索引维护
portal_migration	站点升级
portal_atct	ATcontentType升级
mimetype_registry	文件类型注册表
portal_transform	内容转换
portal_discussion	站点评注引擎
portal_quickinstaller	安装和卸载产品
error_log	错误日志
MailHost	邮件设置

2009年4月23日星期四

Guido hates tail recursion elimination

大叔对尾递归还是有点意见的....或许大叔对 FP 不感冒. 其实对于某某某无比吹捧 FP 还是很反感的, 或许搞技术的大都凌驾吧. 不过侧面能说明一些问题, 如果 python 支持尾递归优化, 那么.... 其实 python 对 lambda 等的支持很有限. 大叔说: 简言之, 为什么 Python 不用呢, 因为不 Pythonic. 具体就是: TRE is incompatible with nice stack traces. Python 不会让程序员写基于 TRE 特性的代码.

Once tail recursion elimination exists, developers will start writing code that dependson it, and their code won't run on implementations that don't provide it: a typical Python implementation allows 1000 recursions, which is plenty for non-recursively written code and for code that recurses to traverse, for example, a typical parse tree, but not enough for a recursively written loop over a large list.

还有更牛的理由

I don't believe in recursion as the basis of all programming.

; 这个我也想对某些迷信 FP 的人说.

最后, 它提到了一些 Python 中如果使用 TRE 的后果, 如果实现 TRE, 那么将有更多的问题需要解决. 所以, 既然不是 FP, 就没必要整那么多 FP 的东西. lambda, 方便而已, 不是让你整 FP 的.

欢迎拜见大叔:

http://neopythonic.blogspot.com/2009/04/tail-recursion-elimination.html

2009年4月12日星期日

python 中 lambda 的倒掉

灵感来自: http://math.andrej.com/2009/04/09/pythons-lambda-is-broken/ 长话短说, 先看代码.

>>> fs = [(lambda n: i + n) for i in range(10)]
>>> fs[3](4)
13

不觉得该是 3 + 4 = 7 么?

>>> [f(4) for f in fs]
[13, 13, 13, 13, 13, 13, 13, 13, 13, 13]

.....撞石头上了... 换 def 看:

>>> fs = []
>>> for i in range(10):
...    def f(n): return i+n
...    fs.append(f)
...
>>> [f(4) for f in fs]
[13, 13, 13, 13, 13, 13, 13, 13, 13, 13]

其实这时候已经有点眉目了, 继续看下去:

>>> fs = []
>>> for i in range(10):
...    def f(n, i=i): return i+n
...    fs.append(f)
...
>>> [f(4) for f in fs]
[4, 5, 6, 7, 8, 9, 10, 11, 12, 13]

作者提到 Haskell 是没问题的:

Prelude> let fs = [(\n -> i + n) | i <- [0..9]] 
Prelude> [f(4) | f <- fs] 
[4,5,6,7,8,9,10,11,12,13]

想也是~ Haskell 就是做这个的. 怎么回事呢? 我觉得是 lexical scope(static scope) 和 dynamic scope 的关系. lisp 中一般都会讲到这个. 在 def 使用时就已经可以看出, i 在这里作为全局变量了(或者说是上一级变量), 而调用的时候使用的是当前全局环境中的 i. 而 def f(n, i=i) 中, 局部 i 覆盖了全局的 i, 那么也就是说, def 参数列表中默认值参数在定义后就已经绑定.

2009年4月11日星期六

python 的 yield

以前只是类似生成器表达式那样使用... 其实有N多牛B用法. 真长见识~! http://www.dabeaz.com/coroutines/index.html 例如:

# grep.py
#
# A very simple coroutine

def grep(pattern):
  print "Looking for %s" % pattern
  while True:
    line = (yield)
    if pattern in line:
         print line,
# Example use
if __name__ == '__main__':
  g = grep("python")
  g.next()
  g.send("Yeah, but no, but yeah, but no")
  g.send("A series of tubes")
  g.send("python generators rock!")

对于生成器函数来说, 有 next() ,send(), throw() 方法.

next() 比较容易理解.
send(arg) 向生成器发送参数, 会被 line = (yield) 捕获.
throw(typ[,val[,tb]]) -> raise exception in generator, return next yielded value or raise StopIteration. 参数分别是, 异常类型, 异常 str, traceback object.

这算是最简单的.... 配合函数修饰符, 相当强大.... yield ...

尝试 git

尝试了下 git ... http://www.projectlocker.com/ 提供免费的 git service, 以及更有名的 github 需要使用 ssh-keygen 生成个 pub key 加到网站里. 某目录下

git init // 初始
git add * // 加入文件/目录
git status // 察看状态
git show
git commit -m "Message here" // commit 提交
git remote add xxxx git://xxxx/xxx.git/   // 加入 remote
git branch   // 分支管理
git branch -r // 察看 remote branch 
git push xxxx master
git fetch // 抓取版本
git checkout // 导出分支内容,切换分支
git-pull = git-fetch + git-merge
等等

2009年4月5日星期日

数据结构=brainfucking.作业.

设计一个算法,用尽可能少的辅助空间将顺序表前 m 个元素和后 n 个元素进行互换. {a1,a2,a3..am,b1,b2,b3..bn} => {b1,b2,b3..bn,a1,a2,a3..am} 让我想起某单词. brainfucking. now let's fucking. 问题简化到 int 数组吧. 差不多. 函数原型:

void swapSeq(int *lst, int m, int n);

惯例, lst 索引为 1..(m n) "用尽可能少的辅助空间"..... 果然 brainfucking. 函数里定义某数组, 把 b1~bn, a1~am 按顺序复制再拿出来, OK. 不好玩. 直觉告诉我们, 既然是 brainfucking..... Show 实现.

void swapSeq(int *lst, int m, int n)
{
   int i,tmp;
   if(m n<= 1)
         return ;
   for(i=1; i<= m && i<= n; i )
     {
         tmp = lst[i];
         lst[i] = lst[m i];
         lst[m i] = tmp;
     }
   if(m> n)
       return swapSeq(lst n, m-n, n);
   if(n> m)
       return swapSeq(lst m, m, n-m);
}

发现可爱的尾递归. OK, 立即抛弃, 上 while~!

void swapSeq(int *lst, int m, int n)
{
   int i,tmp;
   while(m n> 1)
   {
       for(i=1; i<= m && i<= n; i )
         {
             tmp = lst[i];
             lst[i] = lst[m i];
             lst[m i] = tmp;
         }
       if(m> n)
       {
           lst = n;
           m = m-n;
       }
       else if(n> m)
       {
           lst = m;
           n = n-m;
       }
       else
           return ;
   }
}

抛弃题目本质. BT 下. BT 无罪.

void swapSeq(int *lst, int m, int n)
{
   int i;
   if(m n<= 1)
         return ;
   for(i=1; i<= m && i<= n; i )
     {
         lst[i]^= lst[m i];
         lst[m i]^= lst[i];
         lst[i]^= lst[m i];
     }
   if(m> n)
       return swapSeq(lst n, m-n, n);
   if(n> m)
       return swapSeq(lst m, m, n-m);
}

其实递归未优化的消耗还是很大的, 开优化另说.

void swapSeq(int *lst, int m, int n)
{
   int i;
   while(m n> 1)
   {
       for(i=1; i<= m && i<= n; i )
         {
             lst[i]^= lst[m i];
             lst[m i]^= lst[i];
             lst[i]^= lst[m i];
         }
       if(m> n)
       {
           lst = n;
           m = m-n;
       }
       else if(n> m)
       {
           lst = m;
           n = n-m;
       }
       else
           return ;
   }
}

字符计数

Smalltalk(8063446)  8:58:35 
>>> from collections import Counter 
>>> c=Counter() 
>>> for letter in 'here is a sample of english text':
...   c[letter] += 1 
... 
>>> c
Counter({' ': 6, 'e': 5, 's': 3, 'a': 2, 'i': 2, 'h': 2, 'l': 2, 't': 2, 'g': 1, 'f': 1, 'm': 1, 'o': 1, 'n': 1, 'p': 1, 'r': 1, 'x': 1}) >>> c['e'] 5 >>> c['z'] 0 
;; 其实是个 reduce 过程.  
Feather(85660100)  11:45:28 
>>> reduce(lambda d,s: dict(d, **{s:d.get(s,0)+1}), \
            'here is a sample of english text', {}) 
{'a': 2, ' ': 6, 'e': 5, 'g': 1, 'f': 1, 'i': 2, 'h': 2, 'm': 1, 'l': 2, 'o': 1, 'n': 1, 'p': 1, 's': 3, 'r': 1, 't': 2, 'x': 1}

其实要说的是.... dict() 很强大很强大. 以及其他 python type 函数.

2009年2月28日星期六

百度谷歌造[有图有真相][baidu made from google]

有图有真相... 上图:

真相:

#!win32 only. no unix magic chars
# coding=utf-8

# need pywin32 module
import win32com.client

bro = win32com.client.Dispatch("InternetExplorer.Application")
# show window
bro.Visible = True
bro.Navigate("http://www.baidu.com")

while bro.Busy:
    pass

print bro.LocationName
# MSDN: bro.Document is a DOM object
doc = bro.Document
logo = doc.getElementsByTagName('img')[0]
logo.setAttribute('src', 'http://www.google.cn/intl/zh-CN/images/logo_cn.gif')
logo.setAttribute('width', '286')
logo.setAttribute('height', '110')

2009年2月24日星期二

HTTP Basic 验证HTTPBasicAuthHandler

以 fanfou api 为例说明. 有兴趣的可以做个功能完整的工具.


#!/usr/bin/python
# coding=utf-8

import urllib2
import xml.dom.minidom
import winsound
import time
import thread
import urllib

public_timeline_url = "http://api.fanfou.com/statuses/public_timeline.xml"
user_timeline_url = "http://api.fanfou.com/statuses/user_timeline.xml"
friends_timeline_url = "http://api.fanfou.com/statuses/friends_timeline.xml"
statuses_update_url = "http://api.fanfou.com/statuses/update.xml"


# friendship related
friendships_create_url = "http://api.fanfou.com/friendships/create.xml"
friendships_destroy_url = "http://api.fanfou.com/friendships/destroy.xml"

# block list
blocks_create_url = "http://api.fanfou.com/blocks/create.xml"
blocks_destroy_url = "http://api.fanfou.com/blocks/destroy.xml"

passwd_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
passwd_mgr.add_password(realm=None,
                          uri='http://api.fanfou.com/',
                          user='xxxxxxx',
                          passwd='xxxxxxxxx') # 用户名和密码

auth_handler = urllib2.HTTPBasicAuthHandler(passwd_mgr)
opener = urllib2.build_opener(auth_handler)

last_id = ''


def parseDom(doc):
    sts = doc.getElementsByTagName('status')
    get_textValue = lambda tag: \
        (lambda d:d.getElementsByTagName(tag)[0].childNodes[0].data)
    create_time = get_textValue('created_at')
    id_str = get_textValue('id')
    text = get_textValue('text')
    username = get_textValue('name')
    return [(username(node), text(node), create_time(node), id_str(node))
                for node in sts][::-1]

def refresh():
    global last_id
    print last_id
    res = opener.open(friends_timeline_url)
    doc = xml.dom.minidom.parseString(res.read())
    posts = parseDom(doc)
    id_list = [ids for (_, _, _, ids) in posts]
    if last_id in id_list:
        id_list = id_list[id_list.index(last_id)+1:]
        winsound.MessageBeep(winsound.MB_ICONASTERISK)
    print 
    for (user, txt, tm, ids) in posts:
        print "{%s}: %s [%s]" % (user, txt, tm.split()[3])
    last_id = posts[-1][-1]
    print 'MyMsg:',

def refresh_thread(times=0, timeout=60):
    f = False
    if times== 0:
        f = True
    while f or times:
        refresh()
        time.sleep(timeout)
    
def post_statue(msg, reply=''):
    if not msg:
        return
    #msg = unicode(msg, 'gbk')
    data = {'status' : msg,
            'in_reply_to_status_id' : reply,
            'source' : 'ffsh'}
    data = urllib.urlencode(data)
    opener.open(statuses_update_url, data)
  

thread.start_new_thread(refresh_thread, (0,30))

while True:
    mymsg = raw_input('MyMsg: ')
    post_statue( mymsg )

2009年2月22日星期日

用 python 暴百度贴吧

当然, 只是原型, 但基本功能有了. 目前在刷回帖的时候会有乱码, 编码问题需要解决. 成果展示:

---------------------------------------------------------------



#!/usr/bin/python
# -*- coding:utf-8 -*-

import urllib2
import urllib
import socket
import libxml2dom
import Image
import time



def domToQueryDict(doc, formIndex=0):
   data = {}
   form = doc.getElementsByTagName('form')[formIndex]
   nodes = form.getElementsByTagName('input')
   nodes.extend(form.getElementsByTagName('textarea'))
   for node in nodes:
       if node.hasAttribute('name') and node.getAttribute('type')!= u"submit":
           if node.hasAttribute('value'):
               data[node.getAttribute('name')] = \
                   node.getAttribute('value').encode('gbk', 'ignore')
           else:
               data[node.getAttribute('name')] = ""
   return data

             
cookie_handler = urllib2.HTTPCookieProcessor()
opener = urllib2.build_opener(cookie_handler)
urllib2.install_opener(opener)

# 获得登陆数据
url = "http://passport.baidu.com/?" + \
   "login&tpl=tb&u=http%3A//tieba.baidu.com/f%3Fkw%3Dpython"
doc = libxml2dom.parseString(opener.open(url).read(), html=1,
                            htmlencoding='gbk')
data = domToQueryDict(doc)

# 用户名/密码设置
data[u'username'] = '用户名'
data[u'password'] = '密码'

# 验证图片处理
veryfyPic = opener.open("https://passport.baidu.com/?verifypic")
picfile = file("pic.jpg", 'wb')
picfile.write(veryfyPic.read())
picfile.close()
im = Image.open("pic.jpg")
im.show()
data["verifycode"] = raw_input("What you see?").strip()

# 登陆页面
data = urllib.urlencode(data)
b = opener.open(url, data)  # here login OK
print b.read()
print '-'*20

for i in xrange(20):
   # 分析主页数据
   # url = "http://tieba.baidu.com/f?kz=543530949"
   b = opener.open("http://tieba.baidu.com/f?kw=python&t=1")
   frontpage = b.read()
   doc = libxml2dom.parseString(frontpage, html=1, htmlencoding='gbk')
             
   data = domToQueryDict(doc, 1)
   print "*",
   # 发贴标题及内容
   data[u'ti'] = "Fledna 自爆专用~!!!!!!!!!!!!!!"+str(i)
   data[u'co'] = "For Test only. " + str(i)
   print data[u'ti']
   data = urllib.urlencode(data)
   post_url = "http://tieba.baidu.com/f"
   b = opener.open(post_url, data)
   time.sleep(20)

订阅：博文 (Atom)

reversed("andelf")== "fledna"