Name: ollama
Distribution: openSUSE Tumbleweed
Version: 0.12.6
Vendor: openSUSE
Release: 1.1
Build date: Sat Oct 18 07:33:15 2025
Group: Unspecified
Build host: reproducible
Size: 126488305 bytes
Source RPM: ollama-0.12.6-1.1.src.rpm
Packager: http://bugs.opensuse.org
Url: https://ollama.com
Summary: Tool for running AI models on-premise
Ollama is a tool for running AI models on one's own hardware. It offers a command-line interface and a RESTful API. New models can be created or existing ones modified in the Ollama library using the Modelfile syntax. Source model weights found on Hugging Face and similar sites can be imported.
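As a rough illustration of the RESTful API mentioned above, the following Python sketch builds a non-streaming request for the /api/generate endpoint that appears throughout the changelog. It assumes the server listens on Ollama's default address 127.0.0.1:11434; the helper names build_generate_request and generate are hypothetical and chosen only for this example, and the model name used in the usage note is a placeholder for whatever model is installed locally.

```python
import json
import urllib.request

# Assumption: the server runs on Ollama's default address.
# Change this if the service is configured to listen elsewhere.
OLLAMA_URL = "http://127.0.0.1:11434"


def build_generate_request(model, prompt, base_url=OLLAMA_URL):
    """Build (but do not send) a non-streaming POST request for /api/generate."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url=f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model, prompt):
    """Send the request and return the generated text.

    Requires a running Ollama server with the given model available.
    """
    with urllib.request.urlopen(build_generate_request(model, prompt)) as resp:
        return json.loads(resp.read())["response"]
```

With the service running and a model pulled, a call such as generate("llama3.2", "Why is the sky blue?") should return the completion text; the model name here is only an example.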
License: MIT
* Sat Oct 18 2025 Glen Masgai <glen.masgai@gmail.com>
- Update vendored golang.org/x/net/html to v0.46.0
- Update to version 0.12.6
  * Experimental Vulkan support
  * Ollama's app now supports searching when running DeepSeek-V3.1, Qwen3 and other models that support tool calling.
  * Flash attention is now enabled by default for Gemma 3, improving performance and memory utilization
  * Fixed issue where Ollama would hang while generating responses
  * Fixed issue where qwen3-coder would act in raw mode when using /api/generate or ollama run qwen3-coder <prompt>
  * Fixed qwen3-embedding providing invalid results
  * Ollama will now evict models correctly when num_gpu is set
  * Fixed issue where tool_index with a value of 0 would not be sent to the model
- Add ollama user to render group
* Sat Oct 11 2025 Eyad Issa <eyadlorenzo@gmail.com>
- Update vendored golang.org/x/net/html to v0.45.0 [boo#1251413] [CVE-2025-47911] [boo#1241757] [CVE-2025-22872]
- Update to version 0.12.5:
  * Fixed issue where "think": false would show an error instead of being silently ignored
  * Fixed deepseek-r1 output issues
- Update to version 0.12.4:
  * Flash attention is now enabled by default for Qwen 3 and Qwen 3 Coder
  * Fixed an issue where keep_alive in the API would accept different values for the /api/chat and /api/generate endpoints
  * Fixed tool calling rendering with qwen3-coder
  * More reliable and accurate VRAM detection
  * OLLAMA_FLASH_ATTENTION can now be overridden to 0 for models that have flash attention enabled by default
  * Fixed crash where templates were not correctly defined
* Sat Oct 04 2025 Eyad Issa <eyadlorenzo@gmail.com>
- Update to version 0.12.3:
  * New models: DeepSeek-V3.1-Terminus, Kimi K2-Instruct-0905
  * Fixed issue where tool calls provided as stringified JSON would not be parsed correctly
  * ollama push will now provide a URL to follow to sign in
  * Fixed issues where qwen3-coder would output unicode characters incorrectly
  * Fix issue where loading a model with /load would crash
- Update to version 0.12.2:
  * A new web search API is now available in Ollama
  * Models with Qwen3's architecture including MoE now run in Ollama's new engine
  * Fixed issue where built-in tools for gpt-oss were not being rendered correctly
  * Support multi-regex pretokenizers in Ollama's new engine
  * Ollama's new engine can now load tensors by matching a prefix or suffix
- Update to version 0.12.1:
  * New model: Qwen3 Embedding: state of the art open embedding model by the Qwen team
  * Qwen3-Coder now supports tool calling
  * Fixed issue where Gemma3 QAT models would not output correct tokens
  * Fix issue where & characters in Qwen3-Coder would not be parsed correctly when function calling
  * Fixed issues where ollama signin would not work properly
- Update to version 0.12.0:
  * Cloud models are now available in preview
  * Models with the Bert architecture now run on Ollama's engine
  * Models with the Qwen 3 architecture now run on Ollama's engine
  * Fixed issue where models would not be imported correctly with ollama create
  * Ollama will skip parsing the initial <think> if provided in the prompt for /api/generate
- Update to version 0.11.11:
  * Improved memory usage when using gpt-oss
  * Fixed error that would occur when attempting to import safetensor files
  * Improved memory estimates for hybrid and recurrent models
  * Fixed error that would occur when batch size was greater than context length
  * Flash attention & KV cache quantization validation fixes
  * Add dimensions field to embed requests
  * Enable new memory estimates in Ollama's new engine by default
  * Ollama will no longer load split vision models in the Ollama engine
* Tue Sep 09 2025 Eyad Issa <eyadlorenzo@gmail.com>
- Update to version 0.11.10:
  * Added support for EmbeddingGemma, a new open embedding model
- Update to version 0.11.9:
  * Improved performance via overlapping GPU and CPU computations
- Update to version 0.11.8:
  * gpt-oss now has flash attention enabled by default for systems that support it
  * Improved load times for gpt-oss
* Mon Aug 25 2025 Eyad Issa <eyadlorenzo@gmail.com>
- Update to version 0.11.7:
  * DeepSeek-V3.1 is now available to run via Ollama.
  * Fixed issue where multiple models would not be loaded on CPU-only systems
  * Ollama will now work with models that skip outputting the initial <think> tag (e.g. DeepSeek-V3.1)
  * Fixed issue where text would be emitted when there is no opening <think> tag from a model
  * Fixed issue where tool calls containing { or } would not be parsed correctly
- Update to version 0.11.6:
  * Improved performance when using flash attention
  * Fixed boundary case when encoding text using BPE
- Update to version 0.11.5:
  * Performance improvements for the gpt-oss models
  * Improved memory management for scheduling models on GPUs, leading to better VRAM utilization, model performance and fewer out-of-memory errors. These new memory estimations can be enabled with OLLAMA_NEW_ESTIMATES=1 ollama serve and will soon be enabled by default.
  * Improved multi-GPU scheduling and reduced VRAM allocation when using more than 2 GPUs
  * Fix error when parsing bad harmony tool calls
  * OLLAMA_FLASH_ATTENTION=1 will also enable flash attention for pure-CPU models
  * Fixed OpenAI-compatible API not supporting reasoning_effort
* Thu Aug 07 2025 Eyad Issa <eyadlorenzo@gmail.com>
- Update to version 0.11.4:
  * openai: allow for content and tool calls in the same message
  * openai: when converting role=tool messages, propagate the tool name
  * openai: always provide reasoning
  * Bug fixes
* Wed Aug 06 2025 Eyad Issa <eyadlorenzo@gmail.com>
- Update to version 0.11.0:
  * New model: OpenAI gpt-oss 20B and 120B
  * Quantization - MXFP4 format
* Tue Aug 05 2025 Eyad Issa <eyadlorenzo@gmail.com>
- Update to version 0.10.1:
  * No notable changes.
- Update to version 0.10.0:
  * ollama ps will now show the context length of loaded models
  * Improved performance in gemma3n models by 2-3x
  * Parallel request processing now defaults to 1
  * Fixed issue where tool calling would not work correctly with granite3.3 and mistral-nemo models
  * Fixed issue where Ollama's tool calling would not work correctly if a tool's name was part of another one, such as add and get_address
  * Improved performance when using multiple GPUs by 10-30%
  * Ollama's OpenAI-compatible API will now support WebP images
  * Fixed issue where ollama show would report an error
  * ollama run will more gracefully display errors
* Thu Jul 03 2025 Eyad Issa <eyadlorenzo@gmail.com>
- Update to version 0.9.5:
  * No notable changes.
- Update to version 0.9.4:
  * The directory in which models are stored can now be modified.
  * Tool calling with empty parameters will now work correctly
  * Fixed issue when quantizing models with the Gemma 3n architecture
- Update to version 0.9.3:
  * Ollama now supports Gemma 3n
  * Ollama will now limit context length to what the model was trained against to avoid strange overflow behavior
- Update to version 0.9.2:
  * Fixed issue where tool calls without parameters would not be returned correctly
  * Fixed "does not support generate" errors
  * Fixed issue where some special tokens would not be tokenized properly for some model architectures
* Tue Jun 17 2025 Eyad Issa <eyadlorenzo@gmail.com>
- Update to version 0.9.1:
  * Tool calling reliability and performance have been improved for the following models: Magistral, Llama 4, Mistral, DeepSeek-R1-0528
  * Magistral now supports disabling thinking mode
  * Error messages that previously showed POST predict will now be more informative
* Sat May 31 2025 Eyad Issa <eyadlorenzo@gmail.com>
- Update to version 0.9.0:
  * Ollama now has the ability to enable or disable thinking. This gives users the flexibility to choose the model's thinking behavior for different applications and use cases.
- Update to version 0.8.0:
  * Ollama will now stream responses with tool calls
  * Logs will now include better memory estimate debug information when running models in Ollama's engine.
- Update to version 0.7.1:
  * Improved model memory management to allocate sufficient memory to prevent crashes when running multimodal models in certain situations
  * Enhanced memory estimation for models to prevent unintended memory offloading
  * ollama show will now show ... when data is truncated
  * Fixed crash that would occur with qwen2.5vl
  * Fixed crash on Nvidia's CUDA for llama3.2-vision
  * Support for Alibaba's Qwen 3 and Qwen 2 architectures in Ollama's new multimodal engine
* Fri May 23 2025 Wolfgang Engel <wolfgang.engel@suse.com>
- Cleanup part in spec file where build for SLE-15-SP6 and above is defined to make if condition more robust
* Wed May 21 2025 Wolfgang Engel <wolfgang.engel@suse.com>
- Allow to build for Package Hub for SLE-15-SP7 (openSUSE:Backports:SLE-15-SP7) with g++-12/gcc-12 by checking for sle_version >= 150600 in spec file (bsc#1243438)
* Sat May 17 2025 Eyad Issa <eyadlorenzo@gmail.com>
- Update to version 0.7.0:
  * Ollama now supports multimodal models via Ollama's new engine, starting with new vision multimodal models:
    ~ Meta Llama 4
    ~ Google Gemma 3
    ~ Qwen 2.5 VL
  * Ollama now supports providing WebP images as input to multimodal models
  * Improved performance of importing safetensors models via ollama create
  * Various bug fixes and performance enhancements
* Tue May 06 2025 Eyad Issa <eyadlorenzo@gmail.com>
- Update to version 0.6.8:
  * Performance improvements for Qwen 3 MoE models on NVIDIA and AMD GPUs
  * Fixed a memory leak that occurred when providing images as input
  * ollama show will now correctly label older vision models such as llava
  * Reduced out of memory errors by improving worst-case memory estimations
  * Fix issue that resulted in a context canceled error
- Update to version 0.6.7:
  * New model: Qwen 3
  * New model: Phi 4 reasoning and Phi 4 mini reasoning
  * New model: llama 4
  * Increased default context window to 4096 tokens
  * Fixed issue where image paths would not be recognized with ~ when being provided to ollama run
  * Improved output quality when using JSON mode in certain scenarios
  * Fixed issue where model would be stuck in the Stopping... state
- Use source url (https://en.opensuse.org/SourceUrls)
* Thu Apr 24 2025 Eyad Issa <eyadlorenzo@gmail.com>
- Update to version 0.6.6:
  * New model: IBM Granite 3.3
  * New model: DeepCoder
  * New, faster model downloading: OLLAMA_EXPERIMENT=client2 ollama serve will run Ollama using a new downloader with improved performance and reliability when running ollama pull
  * Fixed memory leak issues when running Gemma 3, Mistral Small 3.1 and other models on Ollama
  * Improved performance of ollama create when importing models from Safetensors
  * Ollama will now allow tool function parameters with either a single type or an array of types
  * Fixed certain out-of-memory issues caused by not reserving enough memory at startup
  * Fixed nondeterministic model unload order
  * Included the items and $defs fields to properly handle array types in the API
  * OpenAI-Beta headers are now included in the CORS safelist
  * Fixed issue where model tensor data would be corrupted when importing models from Safetensors
* Sat Apr 19 2025 Eyad Issa <eyadlorenzo@gmail.com>
- Add ollama to the video group
- Update to version 0.6.5:
  * Add support for mistral-small
  * Fix issues with spm tokenizer for Gemma 3 models
  * Add checks for values falling out of sliding window cache
  * Improve file descriptor management for tensors and Pull operations
  * Add gfx1200 & gfx1201 GPU support on Linux
  * Optimize sliding window attention and KV cache implementations
  * Implement loading tensors in 32KiB chunks for better performance
  * Add autotemplate for gemma3 models
  * Add benchmarking for ollama server performance
  * Fix file handling in /proc/cpuinfo discovery
  * Support heterogeneous KV cache layer sizes in memory estimation
  * Fix debug logging for memory estimates
  * Improve error handling for empty logits and tensor data reading
  * Return model capabilities from the show endpoint
* Tue Mar 25 2025 me@levitati.ng
- Update to version 0.6.2:
  * Multiple images are now supported in Gemma 3
  * Fixed issue where running Gemma 3 would consume a large amount of system memory
  * ollama create --quantize now works when converting Gemma 3 from safetensors
  * Fixed issue where /save would not work if running a model with / in the name
  * Add support for AMD Strix Halo GPUs
* Tue Mar 18 2025 Bernhard Wiedemann <bwiedemann@suse.com>
- Only require git-core
* Fri Mar 14 2025 Eyad Issa <eyadlorenzo@gmail.com>
- Update BuildRequires to go1.24
- Update to version 0.6.0:
  * New model: Gemma 3
  * Fixed error that would occur when running snowflake-arctic-embed and snowflake-arctic-embed2 models
  * Various performance improvements and bug fixes
* Wed Mar 12 2025 Eyad Issa <eyadlorenzo@gmail.com>
- Update to version 0.5.13:
  * New models: Phi-4-Mini, Granite-3.2-Vision, Command R7B Arabic
  * The default context length can now be set with a new OLLAMA_CONTEXT_LENGTH environment variable. For example, to set the default context length to 8K, use: OLLAMA_CONTEXT_LENGTH=8192 ollama serve
  * Fixed issue where bf16 GGUF files could not be imported
  * Ollama is now able to accept requests from Visual Studio Code and Cursor by accepting requests from origins beginning with vscode-file://
  * Various performance improvements and bug fixes
* Thu Feb 27 2025 eyadlorenzo@gmail.com
- Update to version 0.5.12:
  * New model: Perplexity R1 1776
  * The OpenAI-compatible API will now return tool_calls if the model called a tool
  * Performance on certain Intel Xeon processors should now be restored
  * Fixed permission denied issues after installing Ollama on Linux
  * Fixed issue where additional CPU libraries were included in the arm64 Linux install
  * The progress bar will no longer flicker when running ollama pull
  * Fixed issue where running a model would fail on Linux if Ollama was installed in a path with UTF-8 characters
  * X-Stainless-Timeout will now be accepted as a header in the OpenAI API endpoints
* Sat Feb 15 2025 Eyad Issa <eyadlorenzo@gmail.com>
- Use Ninja instead of Make and update the build script to match the new version
- Update to version 0.5.11:
  * No notable changes for Linux
- Update to version 0.5.10:
  * Fixed issue on multi-GPU Windows and Linux machines where memory estimations would be incorrect
- Update to version 0.5.9:
  * New model: DeepScaleR
  * New model: OpenThinker
- Update to version 0.5.8:
  * Ollama will now use AVX-512 instructions where available for additional CPU acceleration
  * Fixed indexing error that would occur when downloading a model with ollama run or ollama pull
  * Fixes cases where download progress would reverse
* Mon Jan 27 2025 Adrian Schröter <adrian@suse.de>
- Make ollama configurable by the admin via /etc/sysconfig/ollama (boo#1236008)
- cleanup reproducible.patch
* Thu Jan 16 2025 Eyad Issa <eyadlorenzo@gmail.com>
- Removed 01-build-verbose.patch: embedded GOFLAG into .spec file
- Disabled reproducible.patch: should not be needed, as .gz is not produced anymore
- Update to version 0.5.7:
  * Fixed issue where using two FROM commands in Modelfile
  * Support importing Command R and Command R+ architectures from safetensors
- Update to version 0.5.6:
  * Fixed errors that would occur when running ollama create on Windows and when using absolute paths
- Update to version 0.5.5:
  * New models:
    ~ Phi-4
    ~ Command R7B
    ~ DeepSeek-V3
    ~ OLMo 2
    ~ Dolphin 3
    ~ SmallThinker
    ~ Granite 3.1 Dense
    ~ Granite 3.1 MoE
  * The /api/create API endpoint that powers ollama create has been changed to improve conversion time and also accept a JSON object.
  * Fixed runtime error that would occur when filling the model's context window
  * Fixed crash that would occur when quotes were used in /save
  * Fixed errors that would occur when sending x-stainless headers from OpenAI clients
- Update to version 0.5.4:
  * New model: Falcon3
  * Fixed issue where providing null to format would result in an error
- Update to version 0.5.3:
  * Fixed runtime errors on older Intel Macs
  * Fixed issue where setting the format field to "" would cause an error
- Update to version 0.5.2:
  * New model: EXAONE 3.5
  * Fixed issue where whitespace would get trimmed from prompt when images were provided
  * Improved memory estimation when scheduling models
  * OLLAMA_ORIGINS will now check hosts in a case insensitive manner
* Thu Dec 12 2024 Bernhard Wiedemann <bwiedemann@suse.com>
- Add reproducible.patch for deterministic .gz creation (boo#1047218)
* Sat Dec 07 2024 Eyad Issa <eyadlorenzo@gmail.com>
- Update to version 0.5.1:
  * Fixed issue where Ollama's API would generate JSON output when specifying "format": null
  * Fixed issue where passing --format json to ollama run would cause an error
- Update to version 0.5.0:
  * New models:
    ~ Llama 3.3: a new state of the art 70B model.
    ~ Snowflake Arctic Embed 2: Snowflake's frontier embedding model.
  * Ollama now supports structured outputs, making it possible to constrain a model's output to a specific format defined by a JSON schema. The Ollama Python and JavaScript libraries have been updated to support structured outputs, together with Ollama's OpenAI-compatible API endpoints.
  * Fixed error importing model vocabulary files
  * Experimental: new flag to set KV cache quantization to 4-bit (q4_0), 8-bit (q8_0) or 16-bit (f16). This reduces VRAM requirements for longer context windows.
- Update to version 0.4.7:
  * Enable index tracking for tools - openai api support (#7888)
  * llama: fix typo and formatting in readme (#7876)
  * readme: add SpaceLlama, YouLama, and DualMind to community integrations (#7216)
* Sat Nov 30 2024 Eyad Issa <eyadlorenzo@gmail.com>
- Update to version 0.4.6:
  * New model: QwQ: an experimental research model by the Qwen team, focused on advancing AI reasoning capabilities.
  * Tool calls will now be included in streaming responses
  * Ollama will now provide an error when submitting SVG images
  * Image tokens will no longer be counted in token counts when running a text-only model
- Update to version 0.4.5:
  * The Ollama Python Library has been updated
  * Fixed issue where HTTPS_PROXY and HTTP_PROXY environment variables would have no effect
  * Ollama will now accept X-Stainless-Retry-Count used by many OpenAI API clients
  * Fix issue where importing certain GGUF files would result in the incorrect quantization level
  * ollama push will now print the uploaded model URL on ollama.com
- Update to version 0.4.4:
  * Marco-o1: An open large reasoning model for real-world solutions by the Alibaba International Digital Commerce Group (AIDC-AI).
  * Fixed issue where Ollama would freeze when processing requests in parallel (e.g. when using code completion tools)
  * Redirecting output to a file no longer outputs progress bars or spinners
- Update to version 0.4.3:
  * New model: Tülu 3 is a leading instruction following model family, offering fully open-source data, code, and recipes by the Allen Institute for AI.
  * New model: Mistral Large: a new version of Mistral Large with improved Long Context, Function Calling and System Prompt support.
  * Improved performance issues that occurred in Ollama versions 0.4.0-0.4.2
  * Fixed issue that would cause granite3-dense to generate empty responses
  * Fixed crashes and hanging caused by KV cache management
* Sat Nov 16 2024 Eyad Issa <eyadlorenzo@gmail.com>
- Update to version 0.4.2:
  * runner.go: Propagate panics back to the user.
  * runner.go: Increase survivability of main processing loop
  * build: fix arm container image (#7674)
  * add line numbers for parser errors (#7326)
  * chore(deps): bump golang.org/x dependencies (#7655)
  * runner.go: Don't trim whitespace from inputs
  * runner.go: Enforce NUM_PARALLEL directly in the runner
  * cmd: preserve exact bytes when displaying template/system layers (#7586)
  * fix(mllama): sync backend between batches
  * runner.go: Fix off-by-one for num predicted
  * CI: give windows lint more time (#7635)
  * Jetpack support for Go server (#7217)
  * doc: capture numeric group requirement (#6941)
  * docs: Capture docker cgroup workaround (#7519)
  * runner.go: Make KV entry accounting more robust
  * readme: add aichat terminal app to community integrations (#7418)
  * api: fix typos in Go Doc comments (#7620)
  * readme: add GoLamify to community integrations (#7521)
  * readme: add browser extension that enables using Ollama for interacting with web pages (#5827)
  * docs: add mentions of Llama 3.2 (#7517)
  * api: fix typo in python ClientFromEnvironment docs (#7604)
  * readme: add llama3.2-vision to model list (#7580)
* Mon Nov 11 2024 Eyad Issa <eyadlorenzo@gmail.com>
- Add patch 01-build-verbose.patch to add the -v option to go build
- Update to version 0.4.1:
  * runner.go: Check for zero length images
  * docs: update langchainpy.md with proper model name (#7527)
  * Set macos min version for all architectures (#7579)
  * win: remove preview title from installer (#7529)
  * Workaround buggy P2P ROCm copy on windows (#7466)
  * Debug logging for nvcuda init (#7532)
  * Align rocm compiler flags (#7467)
  * Be explicit for gpu library link dir (#7560)
  * docs: OLLAMA_NEW_RUNNERS no longer exists
  * runner.go: Remove unused arguments
  * sched: Lift parallel restriction for multimodal models except mllama
* Thu Nov 07 2024 adrian@suse.de
- Update to version 0.4.0:
  * Update README.md (#7516)
  * One corrupt manifest should not wedge model operations (#7515)
  * prompt: Use a single token when estimating mllama context size
  * readme: add Hexabot to the list of community integrations
  * Quiet down debug log of image payload (#7454)
* Wed Nov 06 2024 Eyad Issa <eyadlorenzo@gmail.com>
- Update to version 0.4.0-rc8:
  * CI: Switch to v13 macos runner (#7498)
  * CI: matrix strategy fix (#7496)
  * Sign windows arm64 official binaries (#7493)
  * readme: add TextCraft to community integrations (#7377)
  * nvidia libs have inconsistent ordering (#7473)
  * CI: omit unused tools for faster release builds (#7432)
  * llama: Improve error handling
  * runner.go: Only allocate 1 element embedding batches for mllama
  * refactor kv estimation
  * mllama cross attention
  * Add basic mllama integration tests (#7455)
  * runner.go: Don't set cross attention before sending embeddings
  * Give unicode test more time to run (#7437)
* Fri Nov 01 2024 Eyad Issa <eyadlorenzo@gmail.com>
- Remove enable-lto.patch
- Update to version 0.4.0-rc6:
  * Refine default thread selection for NUMA systems (#7322)
  * runner.go: Better abstract vision model integration
  * Soften windows clang requirement (#7428)
  * Remove submodule and shift to Go server - 0.4.0 (#7157)
  * Move windows app out of preview (#7347)
  * windows: Support alt install paths, fit and finish (#6967)
  * add more tests for getting the optimal tiled canvas (#7411)
  * Switch windows to clang (#7407)
  * tests: Add test for Unicode processing
  * runner.go: Better handle return NULL values from llama.cpp
  * add mllama image processing to the generate handler (#7384)
  * Bump to latest Go 1.22 patch (#7379)
  * Fix deepseek deseret regex (#7369)
  * Better support for AMD multi-GPU on linux (#7212)
  * Fix unicode output on windows with redirect to file (#7358)
  * Fix incremental build file deps (#7361)
  * Improve dependency gathering logic (#7345)
  * fix #7247 - invalid image input (#7249)
  * integration: harden embedding test (#7306)
  * default to "FROM ." if a Modelfile isn't present (#7250)
  * Fix rocm windows build and clean up dependency gathering (#7305)
  * runner.go: Merge partial unicode characters before sending
  * readme: add Ollama for Swift to the community integrations (#7295)
  * server: allow vscode-webview origin (#7273)
  * image processing for llama3.2 (#6963)
  * llama: Decouple patching script from submodule (#7139)
  * llama: add compiler tags for cpu features (#7137)
* Wed Oct 30 2024 Alessandro de Oliveira Faria <cabelo@opensuse.org>
- Update to version 0.3.14:
  * New Models
    + Granite 3 MoE: The IBM Granite 1B and 3B models are the first mixture of experts (MoE) Granite models from IBM designed for low latency usage.
    + Granite 3 Dense: The IBM Granite 2B and 8B models are designed to support tool-based use cases and support for retrieval augmented generation (RAG), streamlining code generation, translation and bug fixing.
* Sat Oct 12 2024 eyadlorenzo@gmail.com
- Update to version 0.3.13:
  * New safety models:
    ~ Llama Guard 3: a series of models by Meta, fine-tuned for content safety classification of LLM inputs and responses.
    ~ ShieldGemma: ShieldGemma is a set of instruction tuned models from Google DeepMind for evaluating the safety of text prompt input and text output responses against a set of defined safety policies.
  * Fixed issue where ollama pull would leave connections when encountering an error
  * ollama rm will now stop a model if it is running prior to deleting it
* Sat Sep 28 2024 Alessandro de Oliveira Faria <cabelo@opensuse.org>
- Update to version 0.3.12:
  * Llama 3.2: Meta's Llama 3.2 goes small with 1B and 3B models.
  * Qwen 2.5 Coder: The latest series of Code-Specific Qwen models, with significant improvements in code generation, code reasoning, and code fixing.
  * Ollama now supports ARM Windows machines
  * Fixed rare issue where Ollama would report a missing .dll file on Windows
  * Fixed performance issue for Windows without GPUs
* Fri Sep 20 2024 adrian@suse.de
- Update to version 0.3.11:
  * llm: add solar pro (preview) (#6846)
  * server: add tool parsing support for nemotron-mini (#6849)
  * make patches git am-able
  * CI: dist directories no longer present (#6834)
  * CI: clean up naming, fix tagging latest (#6832)
  * CI: set platform build build_linux script to keep buildx happy (#6829)
  * readme: add Agents-Flex to community integrations (#6788)
  * fix typo in import docs (#6828)
  * readme: add vim-intelligence-bridge to Terminal section (#6818)
  * readme: add Obsidian Quiz Generator plugin to community integrations (#6789)
  * Fix incremental builds on linux (#6780)
  * Use GOARCH for build dirs (#6779)
  * Optimize container images for startup (#6547)
  * examples: updated requirements.txt for privategpt example
  * examples: polish loganalyzer example (#6744)
  * readme: add ollama_moe to community integrations (#6752)
  * runner: Flush pending responses before returning
  * add "stop" command (#6739)
  * refactor show ouput
  * readme: add QodeAssist to community integrations (#6754)
  * Verify permissions for AMD GPU (#6736)
  * add *_proxy for debugging
  * docs: update examples to use llama3.1 (#6718)
  * Quiet down dockers new lint warnings (#6716)
  * catch when model vocab size is set correctly (#6714)
  * readme: add crewAI to community integrations (#6699)
  * readme: add crewAI with mesop to community integrations
* Tue Sep 17 2024 adrian@suse.de
- Update to version 0.3.10:
  * openai: align chat temperature and frequency_penalty options with completion (#6688)
  * docs: improve linux install documentation (#6683)
  * openai: don't scale temperature or frequency_penalty (#6514)
  * readme: add Archyve to community integrations (#6680)
  * readme: add Plasmoid Ollama Control to community integrations (#6681)
  * Improve logging on GPU too small (#6666)
  * openai: fix "presence_penalty" typo and add test (#6665)
  * Fix gemma2 2b conversion (#6645)
  * Document uninstall on windows (#6663)
  * Revert "Detect running in a container (#6495)" (#6662)
  * llm: make load time stall duration configurable via OLLAMA_LOAD_TIMEOUT
  * Introduce GPU Overhead env var (#5922)
  * Detect running in a container (#6495)
  * readme: add AiLama to the list of community integrations (#4957)
  * Update gpu.md: Add RTX 3050 Ti and RTX 3050 Ti (#5888)
  * server: fix blob download when receiving a 200 response (#6656)
  * readme: add Gentoo package manager entry to community integrations (#5714)
  * Update install.sh:Replace "command -v" with encapsulated functionality (#6035)
  * readme: include Enchanted for Apple Vision Pro (#4949)
  * readme: add lsp-ai to community integrations (#5063)
  * readme: add ollama-php library to community integrations (#6361)
  * readme: add vnc-lm discord bot community integration (#6644)
  * llm: use json.hpp from common (#6642)
  * readme: add confichat to community integrations (#6378)
  * docs: add group to manual Linux isntructions and verify service is running (#6430)
  * readme: add gollm to the list of community libraries (#6099)
  * readme: add Cherry Studio to community integrations (#6633)
  * readme: add Go fun package (#6421)
  * docs: fix spelling error (#6391)
  * install.sh: update instructions to use WSL2 (#6450)
  * readme: add claude-dev to community integrations (#6630)
  * readme: add PyOllaMx project (#6624)
  * llm: update llama.cpp commit to 8962422 (#6618)
  * Use cuda v11 for driver 525 and older (#6620)
  * Log system memory at info (#6617)
  * readme: add Painting Droid community integration (#5514)
  * readme: update Ollama4j link and add link to Ollama4j Web UI (#6608)
  * Fix sprintf to snprintf (#5664)
  * readme: add PartCAD tool to readme for generating 3D CAD models using Ollama (#6605)
  * Reduce docker image size (#5847)
  * readme: add OllamaFarm project (#6508)
  * readme: add go-crew and Ollamaclient projects (#6583)
  * docs: update faq.md for OLLAMA_MODELS env var permissions (#6587)
  * fix(cmd): show info may have nil ModelInfo (#6579)
  * docs: update GGUF examples and references (#6577)
  * Add findutils to base images (#6581)
  * remove any unneeded build artifacts
  * doc: Add Nix and Flox to package manager listing (#6074)
  * update the openai docs to explain how to set the context size (#6548)
  * fix(test): do not clobber models directory
  * add llama3.1 chat template (#6545)
  * update deprecated warnings
  * validate model path
  * throw an error when encountering unsupport tensor sizes (#6538)
  * Move ollama executable out of bin dir (#6535)
  * update templates to use messages
  * more tokenizer tests
  * add safetensors to the modelfile docs (#6532)
  * Fix import image width (#6528)
  * Update manual instructions with discrete ROCm bundle (#6445)
  * llm: fix typo in comment (#6530)
  * adjust image sizes
  * clean up convert tokenizer
  * detect chat template from configs that contain lists
  * update the import docs (#6104)
  * server: clean up route names for consistency (#6524)
  * Only enable numa on CPUs (#6484)
  * gpu: Group GPU Library sets by variant (#6483)
  * update faq
  * passthrough OLLAMA_HOST path to client
  * convert safetensor adapters into GGUF (#6327)
  * gpu: Ensure driver version set before variant (#6480)
  * llm: Align cmake define for cuda no peer copy (#6455)
  * Fix embeddings memory corruption (#6467)
  * llama3.1
  * convert gemma2
  * create bert models from cli
  * bert
  * Split rocm back out of bundle (#6432)
  * CI: remove directories from dist dir before upload step (#6429)
  * CI: handle directories during checksum (#6427)
  * Fix overlapping artifact name on CI
  * Review comments
  * Adjust layout to bin+lib/ollama
  * Remove Jetpack
  * Add windows cuda v12 + v11 support
  * Enable cuda v12 flags
  * Add cuda v12 variant and selection logic
  * Report GPU variant in log
  * Add Jetson cuda variants for arm
  * Wire up ccache and pigz in the docker based build
  * Refactor linux packaging
  * server: limit upload parts to 16 (#6411)
  * Fix white space.
  * Reset NumCtx.
  * Override numParallel only if unset.
  * fix: chmod new layer to 0o644 when creating it
  * fix: Add tooltip to system tray icon
  * only skip invalid json manifests
  * skip invalid manifest files
  * fix noprune
  * add `CONTRIBUTING.md` (#6349)
  * Fix typo and improve readability (#5964)
  * server: reduce max connections used in download (#6347)
  * update chatml template format to latest in docs (#6344)
  * lint
  * Update openai.md to remove extra checkbox (#6345)
  * llama3.1 memory
* Thu Aug 15 2024 Eyad Issa <eyadlorenzo@gmail.com>
- Update to version 0.3.6:
  * Fixed issue where /api/embed would return an error instead of loading the model when the input field was not provided.
  * ollama create can now import Phi-3 models from Safetensors
  * Added progress information to ollama create when importing GGUF files
  * Ollama will now import GGUF files faster by minimizing file copies
- Update to version 0.3.6:
  * Fixed issue where temporary files would not be cleaned up
  * Fix rare error when Ollama would start up due to invalid model data
* Sun Aug 11 2024 Alessandro de Oliveira Faria <cabelo@opensuse.org>
- Update to version 0.3.4:
  * New embedding models
    - BGE-M3: a large embedding model from BAAI distinguished for its versatility in Multi-Functionality, Multi-Linguality, and Multi-Granularity.
    - BGE-Large: a large embedding model trained in English.
    - Paraphrase-Multilingual: A multilingual embedding model trained on parallel data for 50+ languages.
  * New embedding API with batch support
    - Ollama now supports a new API endpoint /api/embed for embedding generation:
  * This API endpoint supports new features:
    - Batches: generate embeddings for several documents in one request
    - Normalized embeddings: embeddings are now normalized, improving similarity results
    - Truncation: a new truncate parameter; when set to false, the request returns an error if the input exceeds the context length
    - Metrics: responses include load_duration, total_duration and prompt_eval_count metrics
* Sat Aug 03 2024 eyadlorenzo@gmail.com
- Update to version 0.3.3:
  * The /api/embed endpoint now returns statistics: total_duration, load_duration, and prompt_eval_count
  * Added usage metrics to the /v1/embeddings OpenAI compatibility API
  * Fixed issue where /api/generate would respond with an empty string if provided a context
  * Fixed issue where /api/generate would return an incorrect value for context
  * /show modelfile will now render MESSAGE commands correctly
- Update to version 0.3.2:
  * Fixed issue where ollama pull would not resume download progress
  * Fixed issue where phi3 would report an error on older versions
* Tue Jul 30 2024 Adrian Schröter <adrian@suse.de>
- Update to version 0.3.1:
  * Added support for min_p sampling option
  * Lowered number of requests required when downloading models with ollama pull
  * ollama create will now autodetect required stop parameters when importing certain models
  * Fixed issue where /save would cause parameters to be saved incorrectly.
  * OpenAI-compatible API will now return a finish_reason of tool_calls if a tool call occurred.
* Mon Jul 29 2024 Adrian Schröter <adrian@suse.de>
- fix build on leap 15.6
- exclude builds on 32bit due to build failures
* Sun Jul 28 2024 Eyad Issa <eyadlorenzo@gmail.com>
- Update to version 0.3.0:
  * Ollama now supports tool calling with popular models such as Llama 3.1.
    This enables a model to answer a given prompt using tool(s) it knows about, making it possible for models to perform more complex tasks or interact with the outside world.
  * New models:
    ~ Llama 3.1
    ~ Mistral Large 2
    ~ Firefunction v2
    ~ Llama-3-Groq-Tool-Use
  * Fixed duplicate error message when running ollama create
* Wed Jul 24 2024 adrian@suse.de
- Update to version 0.2.8:
  * api embed docs (#5282)
  * convert: capture `head_dim` for mistral (#5818)
  * Update llama.cpp submodule commit to `d94c6e0c` (#5805)
  * server: collect nested tool call objects when parsing (#5824)
  * Remove no longer supported max vram var
  * Refine error reporting for subprocess crash
  * Remove out of space test temporarily (#5825)
  * llm: consider `head_dim` in llama arch (#5817)
  * Adjust windows ROCm discovery
  * add patch for tekken (#5807)
  * preserve last assistant message (#5802)
  * Fix generate test flakiness (#5804)
  * server: validate template (#5734)
  * OpenAI: Function Based Testing (#5752)
  * adjust openai chat msg processing (#5729)
  * fix parsing tool calls
  * server: check for empty tools array too (#5779)
  * always provide content even if empty (#5778)
  * server: only parse tool calls if tools are provided (#5771)
  * Fix context exhaustion integration test for small gpus
  * Refine scheduler unit tests for reliability
* Thu Jul 18 2024 Eyad Issa <eyadlorenzo@gmail.com>
- Fixed issue with shared libraries
* Thu Jul 18 2024 Eyad Issa <eyadlorenzo@gmail.com>
- Added %check section
- Use -v when building
- Update to version 0.2.6:
  * New models: MathΣtral is a 7B model designed for math reasoning and scientific discovery by Mistral AI.
  * Fixed issue where uppercase roles such as USER would no longer work in the chat endpoints
  * Fixed issue where empty system message would be included in the prompt
* Sun Jul 14 2024 eyadlorenzo@gmail.com
- Update to version 0.2.5:
  * Fixed issue where a model's SYSTEM message would not be applied
- Update to version 0.2.4:
  * Fixed issue where context, load_duration and total_duration fields would not be set in the /api/generate endpoint.
  * Ollama will no longer error if loading models larger than system memory if disk space is available
- Update to version 0.2.3:
  * Fix issue where system prompt would not be applied
- Update to version 0.2.2:
  * Fixed errors that occurred when using Ollama with Nvidia V100 GPUs
  * glm4 models will no longer fail to load from out of memory errors
  * Fixed error that would occur when running deepseek-v2 and deepseek-coder-v2 models
  * Fixed a series of out of memory issues when using Nvidia GPUs
  * Fixed a series of errors that would occur when using multiple Radeon GPUs
- Update to version 0.2.1:
  * Fixed issue where setting OLLAMA_NUM_PARALLEL would cause models to be reloaded after each request
- Update to version 0.2.0:
  * Ollama 0.2.0 is now available with concurrency support. This unlocks 2 specific features:
    ~ Ollama can now serve multiple requests at the same time
    ~ Ollama now supports loading different models at the same time
  * New models: GLM-4: A strong multi-lingual general language model with competitive performance to Llama 3.
  * New models: CodeGeeX4: A versatile model for AI software development scenarios, including code completion.
  * New models: Gemma 2: Improved output quality and base text generation models now available
  * Ollama will now show a better error if a model architecture isn't supported
  * Improved handling of quotes and spaces in Modelfile FROM lines
  * Ollama will now return an error if the system does not have enough memory to run a model on Linux
* Sun Jul 07 2024 Eyad Issa <eyadlorenzo@gmail.com>
- Update to version 0.1.48:
  * Fixed issue where Gemma 2 would continuously output when reaching context limits
  * Fixed out of memory and core dump errors when running Gemma 2
  * /show info will now show additional model information in ollama run
  * Fixed issue where ollama show would result in an error on certain vision models
- Update to version 0.1.48:
  * Added support for Google Gemma 2 models (9B and 27B)
  * Fixed issues with ollama create when importing from Safetensors
* Mon Jun 24 2024 Eyad Issa <eyadlorenzo@gmail.com>
- Update to version 0.1.46:
  * Docs (#5149)
  * fix: quantization with template
  * Fix use_mmap parsing for modelfiles
  * Refine mmap default logic on linux
  * Bump latest fedora cuda repo to 39
* Sat Jun 22 2024 Eyad Issa <eyadlorenzo@gmail.com>
- Update to version 0.1.45:
  * New models: DeepSeek-Coder-V2: A 16B & 236B open-source Mixture-of-Experts code language model that achieves performance comparable to GPT4-Turbo in code-specific tasks.
  * ollama show <model> will now show model information such as context window size
  * Model loading on Windows with CUDA GPUs is now faster
  * Setting seed in the /v1/chat/completions OpenAI compatibility endpoint no longer changes temperature
  * Enhanced GPU discovery and multi-gpu support with concurrency
  * Introduced a workaround for AMD Vega RX 56 SDMA support on Linux
  * Fix memory prediction for deepseek-v2 and deepseek-coder-v2 models
  * api/show endpoint returns extensive model metadata
  * GPU configuration variables are now reported in ollama serve
  * Update Linux ROCm to v6.1.1
* Tue Jun 18 2024 Eyad Issa <eyadlorenzo@gmail.com>
- Added documentation files to .spec
- Update to version 0.1.44:
  * Fixed issue where unicode characters such as emojis would not be loaded correctly when running ollama create
  * Fixed certain cases where Nvidia GPUs would not be detected and reported as compute capability 1.0 devices
- Update to version 0.1.43:
  * New import.md guide for converting and importing models to Ollama
  * Fixed issue where embedding vectors resulting from /api/embeddings would not be accurate
  * JSON mode responses will no longer include invalid escape characters
  * Removing a model will no longer show incorrect File not found errors
  * Fixed issue where running ollama create would result in an error on Windows with certain file formatting
- Update to version 0.1.42:
  * New models: Qwen 2: a new series of large language models from Alibaba group
  * ollama pull is now faster if it detects a model is already downloaded
  * ollama create will now automatically detect prompt templates for popular model architectures such as Llama, Gemma, Phi and more.
  * Ollama can now be accessed from local apps built with Electron and Tauri, as well as in developing apps in local html files
  * Update welcome prompt in Windows to llama3
  * Fixed issues where /api/ps and /api/tags would show invalid timestamps in responses
- Update to version 0.1.41:
  * Fixed issue on Windows 10 and 11 with Intel CPUs with integrated GPUs where Ollama would encounter an error
* Sat Jun 01 2024 Eyad Issa <eyadlorenzo@gmail.com>
- Update to version 0.1.40:
  * New model: Codestral: Codestral is Mistral AI’s first-ever code model designed for code generation tasks.
  * New model: IBM Granite Code: now in 3B and 8B parameter sizes.
  * New model: Deepseek V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
  * Fixed out of memory and incorrect token issues when running Codestral on 16GB Macs
  * Fixed issue where full-width characters (e.g. Japanese, Chinese, Russian) were deleted at end of the line when using ollama run
* Wed May 29 2024 Eyad Issa <eyadlorenzo@gmail.com>
- Update to version 0.1.39:
  * New model: Cohere Aya 23: A new state-of-the-art, multilingual LLM covering 23 different languages.
  * New model: Mistral 7B 0.3: A new version of Mistral 7B with initial support for function calling.
  * New model: Phi-3 Medium: a 14B parameters, lightweight, state-of-the-art open model by Microsoft.
  * New model: Phi-3 Mini 128K and Phi-3 Medium 128K: versions of the Phi-3 models that support a context window size of 128K
  * New model: Granite code: A family of open foundation models by IBM for Code Intelligence
  * It is now possible to import and quantize Llama 3 and its finetunes from Safetensors format to Ollama.
  * Full changelog at https://github.com/ollama/ollama/releases/tag/v0.1.39
* Wed May 22 2024 Eyad Issa <eyadlorenzo@gmail.com>
- Added 15.6 build
* Thu May 16 2024 Eyad Issa <eyadlorenzo@gmail.com>
- Update to version 0.1.38:
  * New model: Falcon 2: A new 11B parameters causal decoder-only model built by TII and trained over 5T tokens.
  * New model: Yi 1.5: A new high-performing version of Yi, now licensed as Apache 2.0. Available in 6B, 9B and 34B sizes.
  * Added ollama ps command
  * Added /clear command
  * Fixed issue where switching loaded models on Windows would take several seconds
  * Running /save will no longer abort the chat session if an incorrect name is provided
  * The /api/tags API endpoint will now correctly return an empty list [] instead of null if no models are provided
* Sun May 12 2024 Eyad Issa <eyadlorenzo@gmail.com>
- Update to version 0.1.37:
  * Fixed issue where models with uppercase characters in the name would not show with ollama list
  * Fixed usage string for ollama create
  * Fix finish_reason being "" instead of null in the Open-AI compatible chat API.
* Sun May 12 2024 Eyad Issa <eyadlorenzo@gmail.com>
- Use obs_scm service instead of the deprecated tar_scm
- Use zstd for vendor tarball compression
* Sun May 12 2024 Eyad Issa <eyadlorenzo@gmail.com>
- Update to version 0.1.36:
  * Fixed exit status 0xc0000005 error with AMD graphics cards on Windows
  * Fixed rare out of memory errors when loading a model to run with CPU
- Update to version 0.1.35:
  * New models: Llama 3 ChatQA: A model from NVIDIA based on Llama 3 that excels at conversational question answering (QA) and retrieval-augmented generation (RAG).
  * Quantization: ollama create can now quantize models when importing them using the --quantize or -q flag
  * Fixed issue where inference subprocesses wouldn't be cleaned up on shutdown.
  * Fixed a series of out of memory errors when loading models on multi-GPU systems
  * Ctrl+J characters will now properly add newlines in ollama run
  * Fixed issues when running ollama show for vision models
  * OPTIONS requests to the Ollama API will no longer result in errors
  * Fixed issue where partially downloaded files wouldn't be cleaned up
  * Added a new done_reason field in responses describing why generation stopped responding
  * Ollama will now more accurately estimate how much memory is available on multi-GPU systems especially when running different models one after another
- Update to version 0.1.34:
  * New model: Llava Llama 3
  * New model: Llava Phi 3
  * New model: StarCoder2 15B Instruct
  * New model: CodeGemma 1.1
  * New model: StableLM2 12B
  * New model: Moondream 2
  * Fixed issues with LLaVa models where they would respond incorrectly after the first request
  * Fixed out of memory errors when running large models such as Llama 3 70B
  * Fixed various issues with Nvidia GPU discovery on Linux and Windows
  * Fixed a series of Modelfile errors when running ollama create
  * Fixed no slots available error that occurred when cancelling a request and then sending follow up requests
  * Improved AMD GPU detection on Fedora
  * Improved reliability when using the experimental OLLAMA_NUM_PARALLEL and OLLAMA_MAX_LOADED flags
  * ollama serve will now shut down quickly, even if a model is loading
- Update to version 0.1.33:
  * New model: Llama 3
  * New model: Phi 3 Mini
  * New model: Moondream
  * New model: Llama 3 Gradient 1048K
  * New model: Dolphin Llama 3
  * New model: Qwen 110B
  * Fixed issues where the model would not terminate, causing the API to hang.
  * Fixed a series of out of memory errors on Apple Silicon Macs
  * Fixed out of memory errors when running Mixtral architecture models
  * Added experimental concurrency features:
    ~ OLLAMA_NUM_PARALLEL: Handle multiple requests simultaneously for a single model
    ~ OLLAMA_MAX_LOADED_MODELS: Load multiple models simultaneously
* Tue Apr 23 2024 rrahl0@disroot.org
- Update to version 0.1.32:
  * scale graph based on gpu count
  * Support unicode characters in model path (#3681)
  * darwin: no partial offloading if required memory greater than system
  * update llama.cpp submodule to `7593639` (#3665)
  * fix padding in decode
  * Revert "cmd: provide feedback if OLLAMA_MODELS is set on non-serve command (#3470)" (#3662)
  * Added Solar example at README.md (#3610)
  * Update langchainjs.md (#2030)
  * Added MindsDB information (#3595)
  * examples: add more Go examples using the API (#3599)
  * Update modelfile.md
  * Add llama2 / torch models for `ollama create` (#3607)
  * Terminate subprocess if receiving `SIGINT` or `SIGTERM` signals while model is loading (#3653)
  * app: gracefully shut down `ollama serve` on windows (#3641)
  * types/model: add path helpers (#3619)
  * update llama.cpp submodule to `4bd0f93` (#3627)
  * types/model: make ParseName variants less confusing (#3617)
  * types/model: remove (*Digest).Scan and Digest.Value (#3605)
  * Fix rocm deps with new subprocess paths
  * mixtral mem
  * Revert "types/model: remove (*Digest).Scan and Digest.Value (#3589)"
  * types/model: remove (*Digest).Scan and Digest.Value (#3589)
  * types/model: remove DisplayLong (#3587)
  * types/model: remove MarshalText/UnmarshalText from Digest (#3586)
  * types/model: init with Name and Digest types (#3541)
  * server: provide helpful workaround hint when stalling on pull (#3584)
  * partial offloading
  * refactor tensor query
  * api: start adding documentation to package api (#2878)
  * examples: start adding Go examples using api/ (#2879)
  * Handle very slow model loads
  * fix: rope
  * Revert "build.go: introduce a friendlier way
    to build Ollama (#3548)" (#3564)
  * build.go: introduce a friendlier way to build Ollama (#3548)
  * update llama.cpp submodule to `1b67731` (#3561)
  * ci: use go-version-file
  * Correct directory reference in macapp/README (#3555)
  * cgo quantize
  * no blob create if already exists
  * update generate scripts with new `LLAMA_CUDA` variable, set `HIP_PLATFORM` to avoid compiler errors (#3528)
  * Docs: Remove wrong parameter for Chat Completion (#3515)
  * no rope parameters
  * add command-r graph estimate
  * Fail fast if mingw missing on windows
  * use an older version of the mac os sdk in release (#3484)
  * Add test case for context exhaustion
  * CI missing archive
  * fix dll compress in windows building
  * CI subprocess path fix
  * Fix CI release glitches
  * update graph size estimate
  * Fix macOS builds on older SDKs (#3467)
  * cmd: provide feedback if OLLAMA_MODELS is set on non-serve command (#3470)
  * feat: add OLLAMA_DEBUG in ollama server help message (#3461)
  * Revert options as a ref in the server
  * default head_kv to 1
  * fix metal gpu
  * Bump to b2581
  * Refined min memory from testing
  * Release gpu discovery library after use
  * Safeguard for noexec
  * Detect too-old cuda driver
  * Integration test improvements
  * Apply 01-cache.diff
  * Switch back to subprocessing for llama.cpp
  * Simplify model conversion (#3422)
  * fix generate output
  * update memory calculations
  * refactor model parsing
  * Add chromem-go to community integrations (#3437)
  * Update README.md (#3436)
  * Community Integration: CRAG Ollama Chat (#3423)
  * Update README.md (#3378)
  * Community Integration: ChatOllama (#3400)
  * Update 90_bug_report.yml
  * Add gemma safetensors conversion (#3250)
  * CI automation for tagging latest images
  * Bump ROCm to 6.0.2 patch release
  * CI windows gpu builds
  * Update troubleshooting link
  * fix: trim quotes on OLLAMA_ORIGINS
- add set_version to automatically switch over to the newer version
* Tue Apr 16 2024 bwiedemann@suse.com
- Update to version 0.1.31:
  * Backport MacOS SDK fix from main
  * Apply 01-cache.diff
  * fix: workflows
  * stub stub
  * mangle arch
  * only generate on changes to llm subdirectory
  * only generate cuda/rocm when changes to llm detected
  * Detect arrow keys on windows (#3363)
  * add license in file header for vendored llama.cpp code (#3351)
  * remove need for `$VSINSTALLDIR` since build will fail if `ninja` cannot be found (#3350)
  * change `github.com/jmorganca/ollama` to `github.com/ollama/ollama` (#3347)
  * malformed markdown link (#3358)
  * Switch runner for final release job
  * Use Rocky Linux Vault to get GCC 10.2 installed
  * Revert "Switch arm cuda base image to centos 7"
  * Switch arm cuda base image to centos 7
  * Bump llama.cpp to b2527
  * Fix ROCm link in `development.md`
  * adds ooo to community integrations (#1623)
  * Add cliobot to ollama supported list (#1873)
  * Add Dify.AI to community integrations (#1944)
  * enh: add ollero.nvim to community applications (#1905)
  * Add typechat-cli to Terminal apps (#2428)
  * add new Web & Desktop link in readme for alpaca webui (#2881)
  * Add LibreChat to Web & Desktop Apps (#2918)
  * Add Community Integration: OllamaGUI (#2927)
  * Add Community Integration: OpenAOE (#2946)
  * Add Saddle (#3178)
  * tlm added to README.md terminal section.
    (#3274)
  * Update README.md (#3288)
  * Update README.md (#3338)
  * Integration tests conditionally pull
  * add support for libcudart.so for CUDA devices (adds Jetson support)
  * llm: prevent race appending to slice (#3320)
  * Bump llama.cpp to b2510
  * Add Testcontainers into Libraries section (#3291)
  * Revamp go based integration tests
  * rename `.gitattributes`
  * Bump llama.cpp to b2474
  * Add docs for GPU selection and nvidia uvm workaround
  * doc: faq gpu compatibility (#3142)
  * Update faq.md
  * Better tmpdir cleanup
  * Update faq.md
  * update `faq.md`
  * dyn global
  * llama: remove server static assets (#3174)
  * add `llm/ext_server` directory to `linguist-vendored` (#3173)
  * Add Radeon gfx940-942 GPU support
  * Wire up more complete CI for releases
  * llm,readline: use errors.Is instead of simple == check (#3161)
  * server: replace blob prefix separator from ':' to '-' (#3146)
  * Add ROCm support to linux install script (#2966)
  * .github: fix model and feature request yml (#3155)
  * .github: add issue templates (#3143)
  * fix: clip memory leak
  * Update README.md
  * add `OLLAMA_KEEP_ALIVE` to environment variable docs for `ollama serve` (#3127)
  * Default Keep Alive environment variable (#3094)
  * Use stdin for term discovery on windows
  * Update ollama.iss
  * restore locale patch (#3091)
  * token repeat limit for prediction requests (#3080)
  * Fix iGPU detection for linux
  * add more docs for the modelfile message command (#3087)
  * warn when json format is expected but not mentioned in prompt (#3081)
  * Adapt our build for imported server.cpp
  * Import server.cpp as of b2356
  * refactor readseeker
  * Add docs explaining GPU selection env vars
  * chore: fix typo (#3073)
  * fix gpu_info_cuda.c compile warning (#3077)
  * use `-trimpath` when building releases (#3069)
  * relay load model errors to the client (#3065)
  * Update troubleshooting.md
  * update llama.cpp submodule to `ceca1ae` (#3064)
  * convert: fix shape
  * Avoid rocm runner and dependency clash
  * fix `03-locale.diff`
  * Harden for deps
    file being empty (or short)
  * Add ollama executable peer dir for rocm
  * patch: use default locale in wpm tokenizer (#3034)
  * only copy deps for `amd64` in `build_linux.sh`
  * Rename ROCm deps file to avoid confusion (#3025)
  * add `macapp` to `.dockerignore`
  * add `bundle_metal` and `cleanup_metal` functions to `gen_darwin.sh`
  * tidy cleanup logs
  * update llama.cpp submodule to `77d1ac7` (#3030)
  * disable gpu for certain model architectures and fix divide-by-zero on memory estimation
  * Doc how to set up ROCm builds on windows
  * Finish unwinding idempotent payload logic
  * update llama.cpp submodule to `c2101a2` (#3020)
  * separate out `isLocalIP`
  * simplify host checks
  * add additional allowed hosts
  * Update docs `README.md` and table of contents
  * add allowed host middleware and remove `workDir` middleware (#3018)
  * decode ggla
  * convert: fix default shape
  * fix: allow importing a model from name reference (#3005)
  * update llama.cpp submodule to `6cdabe6` (#2999)
  * Update api.md
  * Revert "adjust download and upload concurrency based on available bandwidth" (#2995)
  * cmd: tighten up env var usage sections (#2962)
  * default terminal width, height
  * Refined ROCm troubleshooting docs
  * Revamp ROCm support
  * update go to 1.22 in other places (#2975)
  * docs: Add LLM-X to Web Integration section (#2759)
  * fix some typos (#2973)
  * Convert Safetensors to an Ollama model (#2824)
  * Allow setting max vram for workarounds
  * cmd: document environment variables for serve command
  * Add Odin Runes, a Feature-Rich Java UI for Ollama, to README (#2440)
  * Update api.md
  * Add NotesOllama to Community Integrations (#2909)
  * Added community link for Ollama Copilot (#2582)
  * use LimitGroup for uploads
  * adjust group limit based on download speed
  * add new LimitGroup for dynamic concurrency
  * refactor download run
* Wed Mar 06 2024 computersemiexpert@outlook.com
- Update to version 0.1.28:
  * Fix embeddings load model behavior (#2848)
  * Add Community Integration: NextChat (#2780)
  * prepend
    image tags (#2789)
  * fix: print usedMemory size right (#2827)
  * bump submodule to `87c91c07663b707e831c59ec373b5e665ff9d64a` (#2828)
  * Add ollama user to video group
  * Add env var so podman will map cuda GPUs
* Tue Feb 27 2024 Jan Engelhardt <jengelh@inai.de>
- Edit description, answer _what_ the package is and use nominal phrase. (https://en.opensuse.org/openSUSE:Package_description_guidelines)
* Fri Feb 23 2024 Loren Burkholder <computersemiexpert@outlook.com>
- Added the Ollama package
- Included a systemd service
/usr/bin/ollama
/usr/lib/ollama
/usr/lib/ollama/libggml-base.so
/usr/lib/ollama/libggml-cpu.so
/usr/lib/ollama/libggml-vulkan.so
/usr/lib/ollama/libvulkan.so.1
/usr/lib/ollama/libvulkan.so.1.4.328
/usr/lib/systemd/system/ollama.service
/usr/lib/sysusers.d/ollama-user.conf
/usr/lib64/ollama
/usr/lib64/ollama/libggml-cpu.so
/usr/lib64/ollama/libggml-vulkan.so
/usr/share/doc/packages/ollama
/usr/share/doc/packages/ollama/README.md
/usr/share/doc/packages/ollama/api.md
/usr/share/doc/packages/ollama/cloud.md
/usr/share/doc/packages/ollama/development.md
/usr/share/doc/packages/ollama/docker.md
/usr/share/doc/packages/ollama/examples.md
/usr/share/doc/packages/ollama/faq.md
/usr/share/doc/packages/ollama/gpu.md
/usr/share/doc/packages/ollama/images
/usr/share/doc/packages/ollama/images/ollama-keys.png
/usr/share/doc/packages/ollama/images/signup.png
/usr/share/doc/packages/ollama/import.md
/usr/share/doc/packages/ollama/linux.md
/usr/share/doc/packages/ollama/macos.md
/usr/share/doc/packages/ollama/modelfile.md
/usr/share/doc/packages/ollama/openai.md
/usr/share/doc/packages/ollama/template.md
/usr/share/doc/packages/ollama/troubleshooting.md
/usr/share/doc/packages/ollama/windows.md
/usr/share/fillup-templates/sysconfig.ollama
/usr/share/licenses/ollama
/usr/share/licenses/ollama/LICENSE
/var/lib/ollama
Generated by rpm2html 1.8.1
Fabrice Bellet, Thu Oct 23 23:06:42 2025