1fbf3585d6  Relay default values to llama runner (#672)  2023-10-02 14:53:16 -04:00
    * include seed in params for llama.cpp server and remove empty filter for temp
    * relay default predict options to llama.cpp
    - reorganize options to match predict request for readability
    * omit empty stop
    Co-authored-by: hallh <hallh@users.noreply.github.com>
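A minimal sketch, in Go, of what relaying these options to the llama.cpp server can look like; the field set, JSON keys, and endpoint below are assumptions for illustration, not the actual ollama code:

    // Sketch: relay generate options, including seed, to a llama.cpp server.
    // Field names and JSON keys are illustrative assumptions.
    package llm

    import (
        "bytes"
        "encoding/json"
        "net/http"
    )

    type llamaRequest struct {
        Prompt      string   `json:"prompt"`
        Seed        int      `json:"seed"`           // now always included
        Temperature float32  `json:"temperature"`    // no longer filtered when zero
        NumPredict  int      `json:"n_predict"`
        Stop        []string `json:"stop,omitempty"` // "omit empty stop"
    }

    func predict(endpoint string, req llamaRequest) (*http.Response, error) {
        body, err := json.Marshal(req)
        if err != nil {
            return nil, err
        }
        return http.Post(endpoint+"/completion", "application/json", bytes.NewReader(body))
    }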
				
					
						
							
							
								 
						
							
a1b2d95f96  remove unused push/pull params (#650)  2023-09-29 17:27:19 -04:00

f40b3de758  use int64 consistently  2023-09-28 11:07:24 -07:00

f221637053  first pass at linux gpu support (#454)  2023-09-12 11:04:35 -04:00
    * linux gpu support
    * handle multiple gpus
    * add cuda docker image (#488)
    Co-authored-by: Michael Yang <mxyng@pm.me>

790d24eb7b  add show command (#474)  2023-09-06 11:04:17 -07:00

0f541a0367  s/ListResponseModel/ModelResponse/  2023-08-31 09:47:10 -04:00

42998d797d  subprocess llama.cpp server (#401)  2023-08-30 16:35:03 -04:00
    * remove c code
    * pack llama.cpp
    * use request context for llama_cpp
    * let llama_cpp decide the number of threads to use
    * stop llama runner when app stops
    * remove sample count and duration metrics
    * use go generate to get libraries
    * tmp dir for running llm
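The gist of the subprocess approach, sketched below with assumed flag names and layout: run the packed server binary out of a temp dir, tie it to a context so cancelling the context stops the runner, and pass no thread count so llama.cpp decides for itself.

    // Sketch: launch a packed llama.cpp server binary as a subprocess.
    // Paths and flags are assumptions for illustration.
    package llm

    import (
        "context"
        "fmt"
        "os"
        "os/exec"
    )

    func startRunner(ctx context.Context, serverBin, modelPath string, port int) (*exec.Cmd, error) {
        // run out of a temp dir so bundled libraries sit next to the binary
        workDir, err := os.MkdirTemp("", "llm")
        if err != nil {
            return nil, err
        }

        cmd := exec.CommandContext(ctx, serverBin,
            "--model", modelPath,
            "--port", fmt.Sprint(port),
            // no thread flag: let llama.cpp pick the number of threads itself
        )
        cmd.Dir = workDir
        cmd.Stdout = os.Stdout
        cmd.Stderr = os.Stderr

        // cancelling ctx (request done, app stopping) terminates the runner
        return cmd, cmd.Start()
    }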
				
					
						
							
							
								 
						
							
8bbff2df98  add model IDs (#439)  2023-08-28 20:50:24 -07:00

f723bf0879  ignore nil map values  2023-08-17 15:50:46 -07:00

f27bc261cf  s/parmeter/parameter/  2023-08-10 16:26:06 -07:00

81d8d7b73f  fix could not convert int  2023-08-10 16:24:17 -07:00

be989d89d1  Token auth (#314)  2023-08-10 11:34:25 -07:00

4b3507f036  embeddings endpoint  2023-08-10 11:45:57 -04:00
    Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

7a5f3616fd  embed text document in modelfile  2023-08-09 10:26:19 -04:00

21ddcaa1f1  pr comments  2023-08-08 13:49:37 -04:00
    - default to embeddings enabled
    - move embedding logic for loaded model to request
    - allow embedding full directory
    - close llm on reload

f2074ed4c0  Merge pull request #306 from jmorganca/default-keep-system  2023-08-08 09:25:34 -07:00
    automatically set num_keep if num_keep < 0

8713ac23a8  allow overriding `template` and `system` in `/api/generate`  2023-08-08 00:55:34 -04:00
    Fixes #297
    Fixes #296
4dc5b117dd  automatically set num_keep if num_keep < 0  2023-08-07 16:19:12 -07:00
    num_keep defines how many tokens to keep in the context when truncating
    inputs. If left at its default value of -1, the server calculates num_keep
    to be the length of the system instructions.
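A rough sketch of that default, with an assumed tokenize callback and names: a negative num_keep is replaced by the token length of the system prompt, so truncation never drops the system instructions.

    // Sketch: derive num_keep from the system prompt when it is left at -1.
    // The tokenize callback and names are illustrative, not the actual code.
    func resolveNumKeep(numKeep int, systemPrompt string, tokenize func(string) []int) int {
        if numKeep >= 0 {
            return numKeep // explicitly set by the caller
        }
        return len(tokenize(systemPrompt)) // keep the system instructions
    }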
				
					
						
							
							
								 
						
							
b9f4d67554  configurable rope frequency parameters  2023-08-03 22:11:58 -07:00

1c5a8770ee  read runner parameter options from map  2023-08-01 13:38:19 -04:00
    - read runner options from map to see what was specified explicitly and
      overwrite zero values
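A sketch of that pattern with illustrative names and defaults: decode the request options into a map first, so a field explicitly set to zero can be told apart from one that was omitted, and only the omitted fields fall back to defaults.

    // Sketch: read options from the raw request map so an explicitly passed
    // zero value is kept, while omitted fields fall back to defaults.
    // Type, field names, and default values are illustrative assumptions.
    type runnerOptions struct {
        Temperature float32
        NumPredict  int
    }

    func resolveOptions(raw map[string]interface{}) runnerOptions {
        opts := runnerOptions{Temperature: 0.8, NumPredict: 128} // assumed defaults
        if v, ok := raw["temperature"]; ok {
            opts.Temperature = float32(v.(float64)) // specified, even if zero
        }
        if v, ok := raw["num_predict"]; ok {
            opts.NumPredict = int(v.(float64)) // JSON numbers decode as float64
        }
        return opts
    }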
				
					
						
							
							
								 
						
							
528bafa585  cache loaded model  2023-08-01 11:24:18 -04:00

184ad8f057  allow specifying stop conditions in modelfile  2023-07-28 11:02:04 -04:00

822a0e36eb  lower batch size to 512  2023-07-28 10:56:21 -04:00

fadf75f99d  add stop conditions  2023-07-27 17:00:47 -07:00

ad3a7d0e2c  add NumGQA  2023-07-27 14:05:11 -07:00

688661ab9b  increase default batch size to 1024  2023-07-27 16:51:01 -04:00

cca61181cb  sample metrics  2023-07-27 09:31:44 -07:00

c490416189  lock on llm.lock(); decrease batch size  2023-07-27 09:31:44 -07:00

f62a882760  add session expiration  2023-07-27 09:31:44 -07:00

3003fc03fc  update predict code  2023-07-27 09:31:44 -07:00

32aec66e6a  add load duration  2023-07-27 09:31:44 -07:00

35af37a2cb  session id  2023-07-27 09:31:44 -07:00

4c1caa3733  download models when creating from modelfile  2023-07-25 14:25:13 -04:00

4cb42ca55e  add copy command (#191)  2023-07-24 11:27:28 -04:00

9f6e97865c  allow pushing/pulling to insecure registries (#157)  2023-07-21 15:42:19 -07:00

7ba1308595  Merge pull request #147 from jmorganca/brucemacd/cli-err-display  2023-07-21 16:10:19 +02:00
    Improve CLI error display

e7a393de54  add rm command for models (#151)  2023-07-20 16:09:23 -07:00

ebaa33ac28  display gin api errors in cli  2023-07-20 20:45:12 +02:00

68df36ae50  fix pull 0 bytes on completed layer  2023-07-18 19:38:11 -07:00

5bea29f610  add new list command (#97)  2023-07-18 09:09:45 -07:00

2fb52261ad  basic distribution w/ push/pull (#78)  2023-07-16 17:02:22 -07:00
    * basic distribution w/ push/pull
    * add the parser
    * add create, pull, and push
    * changes to the parser, FROM line, and fix commands
    * mkdirp new manifest directories
    * make `blobs` directory if it does not exist
    * fix go warnings
    * add progressbar for model pulls
    * move model struct
    Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

965f9ad033  Merge pull request #77 from jmorganca/mem  2023-07-14 14:57:42 -07:00
    continue conversation

5fefaa5d4d  fix typo  2023-07-14 10:47:18 -07:00
1775647f76  continue conversation  2023-07-13 17:13:00 -07:00
    feed responses back into the llm
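One way to picture this, with an illustrative session type rather than the actual implementation: append each response to the running history and include it in the next model input.

    // Sketch: feed each response back into the model input so the next prompt
    // continues the conversation. The session type and predict callback are
    // illustrative assumptions.
    package llm

    import "strings"

    type session struct {
        history []string // alternating prompts and responses
    }

    func (s *session) generate(prompt string, predict func(string) string) string {
        input := strings.Join(append(s.history, prompt), "\n")
        response := predict(input)

        // keep the exchange so the next call sees it
        s.history = append(s.history, prompt, response)
        return response
    }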
				
					
						
							
							
								 
						
							
05e08d2310  return more info in generate response  2023-07-13 09:37:32 -07:00

fd4792ec56  call llama.cpp directly from go  2023-07-11 11:59:18 -07:00

a3ec1ec2a0  consistent error handling for pull and generate  2023-07-10 21:34:15 -07:00

edba935d67  return error in generate response  2023-07-10 13:30:10 -07:00

2d49197b3b  increase default model size to 512  2023-07-10 21:24:41 +02:00

f5e2e150b8  allow overriding default generate options  2023-07-10 20:58:02 +02:00