Haskell Performance Tuning 2

Contents
1. Situation summary
2. Performance improvements
3. A new approach
4. Conclusion

The example problem
Online judge site ALGOSPOT

Problem ID: WEIRD (https://www.algospot.com/judge/problem/read/WEIRD)

In mathematics, weird numbers are natural numbers that are abundant but not semiperfect. In other words, a natural number N is a weird number if and only if:

• Sum of its proper divisors (i.e. less than N ) is greater than the number.

• No subset of its divisors sum to N.

For example, the set of proper divisors of 12 is { 1, 2, 3, 4, 6 } . The sum of these numbers exceed 12, however, 12 is not a weird number since 1 + 2 + 3 + 6 = 12.

However, 70 is a weird number since its proper divisors are {1, 2, 5, 7, 10, 14, 35} and no subset sums to 70 .

Write a program to determine if the given numbers are weird or not.

https://www.algospot.com/judge/problem/read/WEIRD
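
The definition above can be transcribed almost literally into code. The following is a minimal brute-force sketch, not the solution discussed in these slides: the names `properDivisors`, `isAbundant`, `isSemiperfect` and `isWeird` are made up for illustration, and trying every subset is exponential, so it is only usable for small inputs.

```haskell
import Data.List (subsequences)

-- Proper divisors of n by plain trial division (fine for small n only).
properDivisors :: Int -> [Int]
properDivisors n = [d | d <- [1 .. n `div` 2], n `mod` d == 0]

-- Abundant: the proper divisors sum to more than n.
isAbundant :: Int -> Bool
isAbundant n = sum (properDivisors n) > n

-- Semiperfect: some subset of the proper divisors sums to exactly n.
-- Exponential in the number of divisors.
isSemiperfect :: Int -> Bool
isSemiperfect n = any ((== n) . sum) (subsequences (properDivisors n))

-- Weird: abundant but not semiperfect.
isWeird :: Int -> Bool
isWeird n = isAbundant n && not (isSemiperfect n)
```

With these definitions, `isWeird 12` is False (12 is abundant but 1 + 2 + 3 + 6 = 12) and `isWeird 70` is True, matching the two examples in the problem statement.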

Finally solved it
It took a year

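The algorithm used previously
• Once S(N) is known, the abundant check is simple
• The semiperfect check is a 0-1 knapsack problem
• The divisors are integers, so it can be solved with dynamic programming
• Table entry (i, j): the largest sum not exceeding j that can be made using the divisors s1 through si of N
• If the entry in the last row at column N equals N, then N is semiperfect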
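
The recurrence behind this table can be sketched with plain lists. This is an illustration only, not the array-based code from the talk: `stepMax`, `maxSumRows` and `semiperfectMax` are hypothetical names, and the rows here carry an extra column 0.

```haskell
-- One DP row update for divisor d. Entry j of a row is the largest sum <= j
-- that can be made from the divisors processed so far (columns 0..n).
-- For j >= d the candidate is d + row[j - d]; for j < d the old value is kept.
stepMax :: [Int] -> Int -> [Int]
stepMax row d = zipWith max row (replicate d 0 ++ map (+ d) row)

-- All rows of the table for target n: the all-zero row, then one row per divisor.
maxSumRows :: Int -> [Int] -> [[Int]]
maxSumRows n = scanl stepMax (replicate (n + 1) 0)

-- n is semiperfect when the last entry of the last row equals n.
semiperfectMax :: Int -> [Int] -> Bool
semiperfectMax n divs = last (last (maxSumRows n divs)) == n
```

For N = 12 with divisors [1,2,3,4,6] the last row is [0..12], so 12 is semiperfect; for N = 70 the best reachable sum is 69, so 70 is not.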

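Converting to a Boolean problem
• Once S(N) is known, the abundant check is simple
• The semiperfect check is a 0-1 knapsack problem
• The divisors are integers, so it can be solved with dynamic programming
• Table entry (i, j): can a sum of exactly j be made using the divisors s1 through si of N?
• If the entry in the last row at column N is True, then N is semiperfect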
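
The Boolean formulation can be sketched the same way with immutable lists. The names `stepBool` and `semiperfectBool` are hypothetical, and unlike the code in these slides this sketch uses column 0 (the empty sum) as its base case rather than treating the divisor 1 specially.

```haskell
-- One row update for divisor d: sum j is reachable if it already was,
-- or if j - d was reachable before d was considered.
stepBool :: [Bool] -> Int -> [Bool]
stepBool row d = zipWith (||) row (replicate d False ++ row)

-- Start with only the empty sum 0 reachable, fold in every divisor,
-- and read off column n.
semiperfectBool :: Int -> [Int] -> Bool
semiperfectBool n divs = last (foldl stepBool (True : replicate n False) divs)
```

Because each new row is built from the previous one, every divisor is used at most once, which is the 0-1 constraint.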

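Previous 0-1 KNAPSACK code

    import Data.Array.ST (runSTUArray, newArray, newListArray)
    import Data.Array.Unboxed (UArray, (!))

    knapsack :: Int -> UArray Int Int -> Int -> UArray Int Int
    knapsack n divs len = a 0 (runSTUArray $ newArray (0,n) (0::Int)) where
      -- prevRow: index of the next divisor to process; prevA: the previous table row
      a prevRow prevA
        | prevRow == len = prevA
        | otherwise = currA `seq` a (prevRow+1) currA where
            currA = runSTUArray $ newListArray (0,n) [w `seq` (v `seq` v) | w <- [0..n],
                      let ith = divs ! prevRow, let v = maximum [at w, ith + at (w-ith)]]
            at w = if w < 0 then -n else prevA ! w

Unneeded arrays are garbage-collected right away, but a new array is created on every call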

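Revised 0-1 KNAPSACK code

    import Control.Monad (forM_)
    import Data.Array.ST (runSTUArray, newArray, readArray, writeArray)
    import Data.Array.Unboxed ((!))

    knapsack :: Int -> [Int] -> Bool
    knapsack n divs =
      let ary = runSTUArray $ do
            ary <- newArray (1,n) False
            writeArray ary 1 True            -- base case: the divisor 1
            forM_ divs $ \i -> do
              -- walk downwards so each divisor is used at most once
              forM_ [n,n-1..i+1] $ \j -> do
                x <- readArray ary j
                y <- readArray ary (j-i)
                writeArray ary j (x || y)
              writeArray ary i True
            return ary
      in ary ! n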

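• 1 is excluded from the divisor list (it is handled directly as the base case)
• Only a single array, ary, is used
• It looks like C code
• ary is mutable even though we are not inside the IO monad?
• knapsack really is a pure function
• This is possible thanks to rank-2 types
• A detailed explanation will have to wait for another time…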
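
As a small taste of the idea, here is a sketch that uses `runST` from `Control.Monad.ST` with an `STRef` instead of an array; `sumST` is a made-up name. The type `runST :: (forall s. ST s a) -> a` is rank-2: the state-thread variable `s` cannot appear in the result, so no mutable reference can leak out and the function is pure from the outside.

```haskell
import Control.Monad (forM_)
import Control.Monad.ST (runST)
import Data.STRef (modifySTRef', newSTRef, readSTRef)

-- Sums a list with a mutable accumulator, yet has an ordinary pure type.
sumST :: [Int] -> Int
sumST xs = runST $ do
  acc <- newSTRef 0                           -- mutable cell, local to this runST
  forM_ xs $ \x -> modifySTRef' acc (+ x)     -- in-place, strict update
  readSTRef acc                               -- only the final value escapes
```

`runSTUArray` follows the same pattern, except that what escapes is the finished array, frozen into an immutable `UArray`.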

Revised 0-1 KNAPSACK code

    knapsack :: Int -> [Int] -> Bool
    knapsack n divs =
      let ary = runSTUArray $ do
            ary <- newArray (1,n) False
            writeArray ary 1 True
            forM_ divs $ \i -> do
              forM_ [n,n-1..i+1] $ \j -> do
                x <- readArray ary j
                y <- readArray ary (j-i)
                writeArray ary j (x || y)
              writeArray ary i True
              -- if (ary ! n == True), we would like to stop here
            return ary
      in ary ! n

• forM and forM_ always run every action
• If N can already be made from only some of the divisors, there is no need to compute further
• Is there nothing like C's break?
• guard from MonadPlus plays that role
• But forM_ is defined for any Monad
  forM_ :: Monad m => [a] -> (a -> m b) -> m ()
  class (Monad m) => MonadPlus m
• So guard cannot be applied here

Revised 0-1 KNAPSACK code

    -- Runs f on each element in turn and stops as soon as test fails on a result.
    for :: Monad m => [a] -> (b -> Bool) -> (a -> m b) -> m [b]
    for [] _ _ = return []
    for (x:xs) test f = f x >>= \y ->
      if test y
        then for xs test f >>= \ys -> return (y:ys)
        else return []

    knapsack :: Int -> [Int] -> Bool
    knapsack n divs =
      let ary = runSTUArray $ do
            ary <- newArray (1,n) False
            writeArray ary 1 True
            _ <- for divs not $ \i -> do
              forM_ [n,n-1..i+1] $ \j -> do
                x <- readArray ary j
                y <- readArray ary (j-i)
                writeArray ary j (x || y)
              writeArray ary i True
              fin <- readArray ary n
              return fin                 -- True once N is reachable, which stops the loop
            return ary
      in ary ! n

• Implemented our own for that can stop midway
• Each time an action runs, the value inside the monad is extracted
• If applying test to the extracted value gives False, the loop stops
• This implements early exit from the loop
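
To see the early exit in isolation, here is a small sketch that counts how many actions actually run. The `for` function repeats the definition from the slide (written with do-notation); `demo` and its counter are made up for illustration.

```haskell
import Control.Monad.ST (runST)
import Data.STRef (modifySTRef', newSTRef, readSTRef)

-- Same behaviour as the slide's for: stop as soon as test fails on a result.
for :: Monad m => [a] -> (b -> Bool) -> (a -> m b) -> m [b]
for [] _ _ = return []
for (x : xs) test f = do
  y <- f x
  if test y then (y :) <$> for xs test f else return []

-- Loops over [1..10] while the result is below 3 and reports
-- (collected results, number of actions that were executed).
demo :: ([Int], Int)
demo = runST $ do
  counter <- newSTRef 0
  ys <- for [1 .. 10] (< 3) $ \x -> do
    modifySTRef' counter (+ 1)
    return x
  n <- readSTRef counter
  return (ys, n)
```

Here `demo` evaluates to `([1,2],3)`: the third action still runs, its result fails the test, and the remaining seven elements are never touched.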

Let's measure the performance
• Tested again with N = 500000
• Memory usage: 18MB -> 1MB
• Run time: 8.94 s -> 0.19 s

There is no way this fails to pass

Nope

Every trick was used
• Strict evaluation
• Primitive Boolean type for very fast computation
• Avoiding the creation of unnecessary values
• Reducing the search space

Rethinking the computation table

    N = 12     1  2  3  4  5  6  7  8  9 10 11 12
    s1 = 1     1  1  1  1  1  1  1  1  1  1  1  1
    s2 = 2     1  2  3  3  3  3  3  3  3  3  3  3
    s3 = 3     1  2  3  4  5  6  6  6  6  6  6  6
    s4 = 4     1  2  3  4  5  6  7  8  9 10 10 10
    s5 = 6     1  2  3  4  5  6  7  8  9 10 11 12

• For N = 12, total computation: 6 * 12 = 72 cells / computation actually needed: 23 cells
• For N = 18, total computation: 5 * 18 = 90 cells / computation actually needed: 25 cells

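Algorithm overhaul

    knapsack2 :: Int -> [Int] -> Bool
    knapsack2 n divs = f [n] divs where
      -- xs: every remaining target that is still reachable
      f xs [] = any (== 0) xs
      f xs (d:ds) =
        let ys = filter (>= 0) (map (\x -> x - d) xs)
        in f (ys ++ xs) ds

• 1 is included in the divisor list again
• Divisors are processed from the largest down (reverse the divisor list when calling knapsack2)
• But what if there are duplicate elements?
• Just compute them redundantly: filtering them out takes longer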
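
To make the list of remaining targets and its duplicates visible, the following sketch exposes every intermediate list. `step` and `targetStates` are hypothetical helper names; `step` is the body of `f` above without the final check.

```haskell
-- One step of the search: either use d (subtract it, dropping negative
-- targets) or skip it (keep the old targets unchanged).
step :: [Int] -> Int -> [Int]
step xs d = filter (>= 0) (map (subtract d) xs) ++ xs

-- Every intermediate target list, starting from [n].
targetStates :: Int -> [Int] -> [[Int]]
targetStates n = scanl step [n]
```

For `targetStates 12 [6,4,3,2,1]` the lists are `[12]`, `[6,12]`, `[2,8,6,12]`, `[5,3,9,2,8,6,12]`, and after the divisor 2, `[3,1,7,0,6,4,10,5,3,9,2,8,6,12]`. The values 3 and 6 now appear twice, and the 0 shows that 12 = 6 + 4 + 2.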

Algorithm overhaul

    knapsack2 :: Int -> [Int] -> Bool
    knapsack2 n divs = f [n] divs where
      f xs [] = any (== 0) xs
      f xs (d:ds) =
        let ys = filter (>= 0) (map (\x -> x - d) xs)
        in if any (== 0) ys then True else f (ys ++ xs) ds

• Early exit is implemented here as well

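Algorithm overhaul

    knapsack2 :: Int -> [Int] -> Bool
    knapsack2 n divs = f [n] divs (sum divs) where
      -- lim: sum of the divisors that have not been processed yet
      f xs [] _ = any (== 0) xs
      f xs (d:ds) lim =
        let ys = filter (\y -> 0 <= y && y <= lim) (map (\x -> x - d) xs)
        in if any (== 0) ys then True else f (ys ++ xs) ds (lim-d)

• Branch and bound
• M has to be made as a subset sum of the remaining divisors, but if the sum of all of them is less than M, it cannot be made in the first place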
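
Putting the pieces together, a complete `weird` predicate might look like the sketch below. The slides do not show how the divisors are computed or how `weird` is defined, so `properDivisors` (plain trial division) and the body of `weird` are assumptions; `knapsack2` is the branch-and-bound version from this slide.

```haskell
-- Proper divisors in ascending order, by trial division (an assumption;
-- the original divisor computation is not shown).
properDivisors :: Int -> [Int]
properDivisors n = [d | d <- [1 .. n `div` 2], n `mod` d == 0]

-- Subset-sum search with early exit and the remaining-sum bound.
knapsack2 :: Int -> [Int] -> Bool
knapsack2 n divs = f [n] divs (sum divs)
  where
    f xs [] _ = any (== 0) xs
    f xs (d : ds) lim =
      let ys = filter (\y -> 0 <= y && y <= lim) (map (subtract d) xs)
       in any (== 0) ys || f (ys ++ xs) ds (lim - d)

-- Weird: abundant, and no subset of the proper divisors sums to n.
-- The divisors are passed largest first, as the slides suggest.
weird :: Int -> Bool
weird n = sum ds > n && not (knapsack2 n (reverse ds))
  where
    ds = properDivisors n
```

With this sketch, `filter weird [2..1000]` gives `[70,836]`, the two weird numbers below 1000. Since `&&` short-circuits, the subset-sum search only runs for abundant numbers.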

SUPER FAST

    import Control.Monad (forM_, when)

    main :: IO ()
    main = do
      forM_ [2..500000] $ \i -> do
        when (weird i) (print i)

• Even an exhaustive check from 2 to 500,000 takes only 6.68 s
• Memory usage is 14MB

Passed

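However
• Strict evaluation
  • Force computation before the expression grows too long
• Unboxed types
  • Use the same byte representation as C for integers
• Keep only two rows of the table in memory
  • So that unneeded rows are garbage-collected right away

None of these were used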

Conclusion

Just design a good algorithm
