Posts tagged 'model compression' (경량화) (Page 4)

A brief explanation of tensor decomposition

Tensor decomposition approximates a trained weight tensor as a combination of products and sums of smaller vectors or tensors. Fewer weights have to be stored, which in turn reduces computation. 1. CP decomposition: a given tensor can be decomposed into a linear combination of P rank-one tensors. A convolution weight tensor x is approximately decomposed into a linear combination (summation) of outer products (= rank-one tensors) of vectors a, b, c. When this is applied in an actual network, a full convolution generally takes an image and a filter tensor and con..

light weight modeling
· 2024. 8. 17.
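
The CP decomposition described in the excerpt above can be illustrated with a small sketch. It rebuilds a 3-way tensor as a sum of rank-one tensors and compares how many numbers each form stores. The function names, the 4 x 3 x 3 shape, and the vector values are made up for illustration; the sketch only shows the reconstruction side and does not fit the vectors to an existing weight tensor. Plain Python lists are used to keep it dependency-free.

```python
def outer3(a, b, c):
    """Rank-one tensor (outer product of a, b, c) as nested lists, shape len(a) x len(b) x len(c)."""
    return [[[ai * bj * ck for ck in c] for bj in b] for ai in a]


def cp_reconstruct(terms):
    """Sum the rank-one tensors built from each (a, b, c) triple in `terms`."""
    a0, b0, c0 = terms[0]
    total = [[[0.0] * len(c0) for _ in b0] for _ in a0]
    for a, b, c in terms:
        rank_one = outer3(a, b, c)
        for i in range(len(a0)):
            for j in range(len(b0)):
                for k in range(len(c0)):
                    total[i][j][k] += rank_one[i][j][k]
    return total


def param_counts(shape, rank):
    """Numbers stored by the full tensor vs. by `rank` triples of vectors."""
    full = 1
    for dim in shape:
        full *= dim
    return full, rank * sum(shape)


# Hypothetical P = 2 decomposition of a 4 x 3 x 3 tensor (values are arbitrary).
terms = [
    ([1.0, 2.0, 0.0, 1.0], [1.0, 0.0, 1.0], [0.5, 1.0, 0.5]),
    ([0.0, 1.0, 1.0, 2.0], [1.0, 1.0, 0.0], [1.0, 0.0, -1.0]),
]
X = cp_reconstruct(terms)
full, factored = param_counts((4, 3, 3), rank=2)
print(f"full tensor: {full} values, CP factors: {factored} values")
```

The saving grows with tensor size: the full tensor stores the product of its dimensions, while the factors store only P times their sum.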

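Network compiling, briefly

1. Motivation: Compiling takes a fully trained network and makes it runnable for inference on the target hardware it will be deployed to. Optimization techniques are applied along the way, so in practice this is the step with the largest effect on speed. It is also the most complex step, and the material is quite difficult. Well-known vendors provide compile libraries: NVIDIA's TensorRT compiles with optimizations for NVIDIA GPUs, TensorFlow's TFLite targets good performance across a range of embedded devices, and Apache TVM appears to offer features similar to TFLite. 2. Problems: Performance differs from one compile library to another and from one model to another. Basically, once compilation is performed, inference speed..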

light weight modeling
· 2024. 8. 16.

Network quantization, briefly

1. Motivation: Network computations are usually expressed in float32. Quantization maps them to a smaller data type, such as float16 half precision or int8 fixed point, and performs the computation there. 2. Example: As in (1), the float32 matrices are mapped to int8 by quantization. The result of the matrix computation is (2). Dequantizing (2) back to float32 gives (3). Compared with (4), which is computed without quantization, there is some error; this is called the quantization error. Empirically, networks are known to work well and stay robust to quantization error, so it is widely..

light weight modeling
· 2024. 8. 16.
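
The four steps in the quantization excerpt above can be walked through in a few lines. This is a minimal sketch under stated assumptions: it uses symmetric per-tensor int8 quantization with a single scale per matrix (the excerpt does not say which scheme the post uses), and the matrices `W` and `X` and all function names are made up for illustration.

```python
def quantize(matrix, qmax=127):
    """Symmetric per-tensor quantization (assumed scheme): real value ~ scale * integer."""
    max_abs = max(abs(v) for row in matrix for v in row)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [[max(-qmax, min(qmax, round(v / scale))) for v in row] for row in matrix]
    return q, scale


def matmul(a, b):
    """Plain matrix product on nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


# Hypothetical float matrices standing in for a weight and an activation.
W = [[0.12, -0.50], [0.33, 0.08]]
X = [[1.00, -0.25], [0.40, 0.75]]

Wq, w_scale = quantize(W)                      # (1) map floats to int8-range integers
Xq, x_scale = quantize(X)
Y_int = matmul(Wq, Xq)                         # (2) integer-only matrix product
Y_deq = [[v * w_scale * x_scale for v in row]  # (3) dequantize back to float
         for row in Y_int]
Y_ref = matmul(W, X)                           # (4) reference result without quantization

# Largest element-wise gap between (3) and (4): the quantization error.
quant_error = max(
    abs(d - r) for d_row, r_row in zip(Y_deq, Y_ref) for d, r in zip(d_row, r_row)
)
print("quantized W:", Wq)
print("max quantization error:", quant_error)
```

Real int8 kernels accumulate the integer product in a wider type such as int32; Python integers do not overflow, so the sketch does not need to model that.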

Knowledge distillation, briefly

If a large, already trained teacher network is available, transfer its knowledge to a small student network while the student is being trained. 1. General method: A given input x is fed to both the pretrained teacher model and the student model to produce outputs. The teacher model uses softmax(T=t) to produce a soft label, while the student model produces both a hard label with softmax(T=1) and a soft label with softmax(T=t). In part A, the student model's hard prediction is used for ordinary training with the cross entropy against the ground truth..

light weight modeling
· 2024. 8. 15.
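
The temperature softmax and the two student outputs from the distillation excerpt above can be sketched as follows. The hard-label term matches part A of the excerpt. The excerpt is cut off before it explains how the soft labels are used, so the soft term here is an assumption: it follows the commonly used formulation in which the student's softmax(T=t) output is compared with the teacher's by cross entropy and scaled by T². The weighting `alpha`, the function names, and the logit values are likewise hypothetical.

```python
import math


def softmax(logits, T=1.0):
    """Softmax with temperature T; a larger T gives a flatter (softer) distribution."""
    scaled = [z / T for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, pred_probs):
    """Cross entropy of pred_probs against target_probs."""
    return -sum(t * math.log(p) for t, p in zip(target_probs, pred_probs) if t > 0)


def distillation_loss(student_logits, teacher_logits, label, T=4.0, alpha=0.5):
    # Part A: student hard prediction, softmax(T=1), against the ground truth.
    one_hot = [1.0 if i == label else 0.0 for i in range(len(student_logits))]
    hard_loss = cross_entropy(one_hot, softmax(student_logits, T=1.0))
    # Assumed soft term: student softmax(T) against the teacher's soft label.
    soft_loss = cross_entropy(softmax(teacher_logits, T), softmax(student_logits, T))
    # Assumed combination: weighted sum, soft term scaled by T^2.
    total = alpha * hard_loss + (1.0 - alpha) * (T ** 2) * soft_loss
    return total, hard_loss, soft_loss


# Hypothetical logits for one input x with three classes; ground truth is class 0.
z_student = [2.0, 1.0, 0.1]
z_teacher = [3.0, 1.5, 0.2]
total, hard, soft = distillation_loss(z_student, z_teacher, label=0)
print("teacher soft label:", softmax(z_teacher, T=4.0))
print("total / hard / soft:", total, hard, soft)
```

In an actual training loop only the student's parameters would be updated; the teacher's logits are treated as fixed targets.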

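What is efficient architecture design?

1. Motivation: Efficient architecture design means designing an efficient architecture so that a module performs nearly as well as a large model. The recent trend is to let a computer find the model through optimization, for example with AutoML or Neural Architecture Search, rather than have a person design it. 2. Necessity: Modules with all kinds of characteristics are released every day, and they differ widely: some have few parameters yet perform well, some perform well but have far too many parameters, and some need little computation but perform worse. The left figure compares accuracy against the number of operations; the size of each circle is the model's parameter count. The right figure is the model's param..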

light weight modeling
· 2024. 8. 14.

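Why lightweight models? Why deep learning models need to be made lightweight

1. Motivation: Machine learning and deep learning are now used in almost every field: self-driving cars, entertainment, healthcare, NLP, text, speech, image, audio, and many other applications. 2. On-device AI: Machine learning and deep learning applications run directly on smartphones, smartwatches, IoT devices, and similar hardware, and perform inference on the device itself. Object detection, translation, and the like are already deep learning applications that can run on device. However, an application deployed this way must have low power (battery) usage, low RAM usage, low storage usage, and computing p..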

light weight modeling
· 2024. 8. 14.
