AI (artificial intelligence)

APTI

Well-Known Member
Joined
Dec 20, 2022
Messages
1,102
Reaction score
887
Credits
9,437
I have decided I am fed up with current public AI and making my own. I am using docker and ollama with openwebui. I am hoping somebody in here has already run into this and can direct me. I would like to find a helpful group much like this forum that is dedicated to the AI like I described. I want to avoid the toxic environments like reddit and discord. Anybody that has suggestions please post.
 


I have installed ollama on my gaming PC but I use it very seldom and only for coding front-ends (which I hate) with qwen3-coder (30b). I have ollama directly on the machine, and I have the tarball downloaded directly from their website and installed on my user directory (not system-wide). I also installed the ROCm extension, the same, from the webites' tarball.

As a front-end I use Continue on VS Codium, which is an opensource coding agent that works similarly to a GitHub copilot.

I'd say that it should be as straightforward to build as openwebui would be, as ollama accepts all the requests as HTTP requests on 11434 port. As long as your openwebui installation can see that port, the rest would be to download the models you'd like.

Are you stuck at any particular point, or are you still about to try?
 
I have installed ollama on my gaming PC but I use it very seldom and only for coding front-ends (which I hate) with qwen3-coder (30b). I have ollama directly on the machine, and I have the tarball downloaded directly from their website and installed on my user directory (not system-wide). I also installed the ROCm extension, the same, from the webites' tarball.

As a front-end I use Continue on VS Codium, which is an opensource coding agent that works similarly to a GitHub copilot.

I'd say that it should be as straightforward to build as openwebui would be, as ollama accepts all the requests as HTTP requests on 11434 port. As long as your openwebui installation can see that port, the rest would be to download the models you'd like.

Are you stuck at any particular point, or are you still about to try?
I have it working although on a cpu only system and limited to 7b models. Once I have it going I will move it to better setup but my issue is the bubble wrap. I need to jail break the models so they stop the judgemental crap and do what I want. I also need to make them more accurate in responses. Stop the guessing and hallucinations and verify the answers. mostly it is technical stuff so accuracy is highly important. I prefer a response of "I don't Know" rather than a made up answer.
 
That's quite tricky, for the accuracy stuff you need to put a lot of context to narrow the answers.

Judgemental crap, models trying to befriend you, etc., it's done by context as well, but some providers put fancy names to these special baseline contexts. For example, copilot / anthropic call those "skills", and continue call them "rules". I am not sure / no clue how to specify this baseline contextual behaviour in openwebui.

You can skip the below if the coding scenario is not making any sense to you, but those are the examples I can give.

Example:

here are my "rules" for qwen3-coder in continue:

Code:
---
description: Succint
---
- Be succint
- Do not output unnecessary text
- Do not suggest any follow up actions after a response

This file above is very obvious, but it works very well.

And for a python project that I did to learn how to write good context files:
Code:
---
description: Project Structure, Naming Conventions and Tooling
---
## Goal
The project consists of a Python command line application.
This file defines the structure, naming conventions and tooling for test, build, package and manage the dependencies of it.
No functional descriptions are here, just the project's skeleton, rules and housekeeping.
The rules of this file must always be applied.

## Dependency Management and Packaging Toolset
- [Poetry](https://python-poetry.org/) will be used.

## Structure
As a result of using [Poetry](https://python-poetry.org/), this project is organised as follows:
```console
.
├── doc
│   ├── 1-ProblemStatement.md
│   ├── 2-SolutionIntent.md
│   ├── 3-Requirements.md
│   └── 4-Design.md
├── examples
│   ├── 1980-may-15-bday_v2.txt
│   ├── 20250723-filename.txt
│   ├── file_1.txt
│   ├── file-2.txt
│   └── March15_2023-recipe.md
├── src
│   ├── __init__.py
│   ├── llm_renamer.py
│   └── __main__.py
├── test
|   ├── func
|   └── unit
├── poetry.lock
├── pyproject.toml
├── LICENSE
└── README.md
```
- The [root](../../) folder is represented above by a period (`.`). It contain all the build scripts, configuration files, Python application and package description:
  - Poetry's [pyproject.toml](../../pyproject.toml) and [poetry.lock](../../poetry.lock) go here.
  - [.gitignore](../../.gitignore), [LICENSE](../../LICENSE) and [README.md](../../README.md) go here. 
- The [doc](../../doc/) folder contains all the functional documentation of the project.
  - This is not created by Poetry, but by the author of the project.
  - Not a comprehensive list: problem statement, solution intent, requirements, design are in this folder. 
- The [examples](../../examples/) directory contains examples of the problem to solve.
  - This is not created by Poetry, but by the author of the project.
  - The funcional test suite to be written in [test](../../test/) must use this files to ensure that the problem is solved and all test pass.
- The directory [src](../../src/) is for Python source code.
  - This is part of the structure Poetry expects. 
  - All the logic goes here. 
- The directory [test](../../test/) contains the source code of all the test cases. 
  - This is part of the stricture Poetry expects.
  - At a later point in time we'll get into naming conventions for telling apart unit from functional test cases

## Naming Conventions
- Use [PEP 8](https://peps.python.org/pep-0008/) for the code.
- Use [PEP 257](https://peps.python.org/pep-0257/) for the docstrings.

## Segregation of concerns
- The command-line arguments parsing and validation using [argparse](https://docs.python.org/3/library/argparse.html) are meant to be written in the `main()` function of the entrypoint Python source file.
  - This entrypoint is the same to be referenced by the section `[]` of `[pyproject.toml](../../pyproject.toml)`.
- The business logic, or what to do given the command-line arguments, must be written in a different function
  - One function per command-line argument or group of arguments that results in a valid invocation.
  - As a default behaviour for unrecognised options, there will be a distinct function that outputs an arror message, recommends checking the help options (`-h` or `--help`), and exits with `1`.

## Command-Line Interface
- Use this [GNU Coding Standards](https://www.gnu.org/prep/standards/html_node/Command_002dLine-Interfaces.html) document for defining the command line switches, options and arguments.
- Use the Python's [argparse](https://docs.python.org/3/library/argparse.html) module to implement them.

## Test Framework
- Use [unittest](https://docs.python.org/3/library/unittest.html) for unit testing.
  - Unit tests must only use mocks, using [unittest.mock](https://docs.python.org/3/library/unittest.mock.html). **Do not** use the examples under [examples](../../examples/) for unit testing.
- Use [PyTest](https://docs.pytest.org/en/latest/) for functional testing.
  - Use the examples under [examples](../../examples/) for funcional testing. 

## Portability and Isolation.
- A virtual environment defined using [Venv](https://docs.python.org/3/library/venv.html) will be used. 
- The virtual environment will be using the most recent Python 3 installation found in the `$PATH`.
- **Do not modify the system**'s Python environment. 
- Anything needed must be installed within the virtual environment, not in the system.
## License of the Project
[GNU Public License 3.0](https://www.gnu.org/licenses/gpl-3.0.en.html) will be the license of this project. 
To do so, the following text must be included at the top of every source file of the program, as a Python comment:
```txt
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year>  <name of author>
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program.  If not, see <https://www.gnu.org/licenses/>.
```
When including the above text as a Python comment: 
- Replace `<year>` with 2026: the present year.
- Replace `<name of author>` with the name specified in the `## Author` section of the [README.md](../../README.md) file, and 
- Replace `<one line to give the program's name and a brief idea of what it does.>` with the text of the `# Title` header and first sentence under the `# Title` header of the [README.md](../../README.md) file, all in one line.

This file above is kind of the "Contribution rules", and it's good because so gar, it has been followed to the letter. And as they are written as "rules", continue will apply them always for you as context for your prompts.

That way you can forget about those details in the routinary work and focus on what you want the model to do, which in a python project would be the logic of the module or program (calculate prime factors or whatever else purpose you'd be writing python for).

For example, my usual context files would be something like this.
Code:
# Requirements

## Functional Requirements
- The name of the tool will be `demo`.
- The tool will be a command line tool.
- The tool will accept an integer number as an input and will output the prime factors of that number. Then, it will tell whether or not is a prime number.
  - A number is prime if its prime factors are only `1` and itself.
  - The tool will output the prime factors of a number, in an abbreviated way.
    - Each prime factor must be printed **only once**.
    - One prime factor per line.
    - The prime factors will be ordered descending.
  - After all the prime factors, the tool will output a message stating if the number is prime or not.
  - If the task is completed successfully, the tool will exit with zero code --regardless the input is prime or not.
- If the input is not an integer, it will output an error message and exit with a non-zero exit code.

## Non-Functional Requirements
- The internal design must separate the logic from the user command-line interface.

After this, the prompts can be relatively simple, like "build a project with a command line tool following the requirements" and attaching the file "3-Requirements.md" as context.

I did a demo of this in a SLUG meeting and I was able to create a program to factor an integer in just one prompt like the above, connecting my laptop to my gaming PC at home through tailscale. I published the repo here
 
Another important thing is that I haven't connected my models in Ollama to the internet. I haven't looked at it.

This means that the models will only generate pseudo-random text based on their training and on the context you write. A practical implication of it is that, while they "obey" and use Poetry, they will use the poetry syntax and features that were live, and used, at the time their training data set was curated, in my case late 2024 for the above experiment.

So if you seek accuracy of general purpose tasks and use these models as a "search proxy", first thing you would have to do would be to look how to connect ollama models to the web, or more accurately, how to make your conversation agent (openwebui) to fetch text from the internet and feed it as context to the models.
 
I have decided I am fed up with current public AI and making my own. I am using docker and ollama with openwebui. I am hoping somebody in here has already run into this and can direct me. I would like to find a helpful group much like this forum that is dedicated to the AI like I described. I want to avoid the toxic environments like reddit and discord. Anybody that has suggestions please post.
I think you might be interested to take a look at Pewdiepie’s “Odysseus
 
I already have ollama connected to the web. And it does work but let me give some silly examples...
if I say I want to see naked pictures of cousin it, I don't want it to come back with judgements about it and say it can't do it, it should either say "cousin it is nothing but hair, there is nothing to be naked, or just say it does not exist"
if I say I want a coded subroutine that allows me to use android to put the message "Left turn Clyde" on the car display, it should do just that without making up commands that do not exist.

I think the above rules will help but it does not jailbreak the model. it will still refuse to show me blank pictures that would be cousin it naked.

ollama came as a recommended platform but probably from what I read in the above article it was marketing like M$. I am not stuck on ollama so if there is a better thing to use I just have to get good install instructions. I am somewhat new to the AI in this respect. But I need something with accuracy and no moral judgements.
 
I already have ollama connected to the web. And it does work but let me give some silly examples...
if I say I want to see naked pictures of cousin it, I don't want it to come back with judgements about it and say it can't do it, it should either say "cousin it is nothing but hair, there is nothing to be naked, or just say it does not exist"
if I say I want a coded subroutine that allows me to use android to put the message "Left turn Clyde" on the car display, it should do just that without making up commands that do not exist.

I think the above rules will help but it does not jailbreak the model. it will still refuse to show me blank pictures that would be cousin it naked.

ollama came as a recommended platform but probably from what I read in the above article it was marketing like M$. I am not stuck on ollama so if there is a better thing to use I just have to get good install instructions. I am somewhat new to the AI in this respect. But I need something with accuracy and no moral judgements.

I don't know what is "cousin it", but... ollama is just a model manager, it does not generate answers, but just passes them along.
It's the LLM which generate the reply.
Change model to some hacked uncensored one if that's your thing.

That being said, consider dropping ollama because of its shady practices (#7), and move to llama.cpp or LM Studio (based on llama.cpp).
 
I don't know what is "cousin it", but... ollama is just a model manager, it does not generate answers, but just passes them along.
It's the LLM which generate the reply.
Change model to some hacked uncensored one if that's your thing.

That being said, consider dropping ollama because of its shady practices (#7), and move to llama.cpp or LM Studio (based on llama.cpp).
cousin it is from the show the Addams Family. He is a moving hair ball basically. Since you have no idea what it is that is why the humor was lost on you. I have no issue moving to llama.cpp but just have to get it set up and yes it is the models that I need to jailbreak. The uncensored ones seem to be very censored still.
 
cousin it is from the show the Addams Family. He is a moving hair ball basically. Since you have no idea what it is that is why the humor was lost on you. I have no issue moving to llama.cpp but just have to get it set up and yes it is the models that I need to jailbreak. The uncensored ones seem to be very censored still.

1780525260160.png


So that's Cousin Itt
Yeah, sorry, Addams Family was never my thing back when I was in target audience age ;)

In llama.cpp you can load models directly from Hugging Face, with multi-modal files, in GGUF format.
In ollama, you are at a mercy of what ollama team prepared for you already.

I suggest trying llama.cpp for direct access to many more models out there, including uncensored ones.
 
Last edited:
Been running it for a few months now. A heavy duty GPU is highly recommended.
I understand that. But before I dump lots of money on something like this I want to make sure it can be configured to work as desired. So I will wait 15 minutes for a response at this point as long as it is correct and to my liking. Once I know it is ok I can move the drive to a newer more powerful system.
 
Every AI has moderation built-in, pretty sure you can't have it without moderation.
100% accuracy is also not yet possible.
Indeed, and furthermore, LLMs are designed to be creative, meaning 100% accuracy and reproducibility is not prioritized.
 
all nice they can be creative but when you are working with technical stuff, creativity is not a good thing. You can't go making things up when it comes to technical stuff. That is called an error. Main reason I want to do my own where accuracy is the priority and moral judgements are not enforced.
 
Every AI has moderation built-in, pretty sure you can't have it without moderation.
100% accuracy is also not yet possible.

Probably true, but the degree of moderation is highly variable.
Some won't let you discuss some things at all. Others will let you discuss them up to t a point.
 


Follow Linux.org

Staff online


Latest posts

Top