common_init_from_params: setting dry_penalty_last_n to ctx_size = 4096
common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)
ggml_cuda_compute_forward: ADD failed
CUDA error: the provided PTX was compiled with an unsupported toolchain.
current device: 0, in function ggml_cuda_compute_forward at D:\a\llama.cpp\llama.cpp\ggml\src\ggml-cuda\ggml-cuda.cu:2230
err
D:\a\llama.cpp\llama.cpp\ggml\src\ggml-cuda\ggml-cuda.cu:71: CUDA error
sh: ./llama-server.exe: The system detected an overrun of a stack-based buffer in this application. This overrun could potentially allow a malicious user to gain control of this application. Error 0xc0000409
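The "PTX was compiled with an unsupported toolchain" message usually means the installed driver is older than the CUDA toolkit the binary was built with. A minimal sketch of that check follows, assuming you know which toolkit version the llama.cpp build targets. The log does not say, so `required_cuda` is a hypothetical value, and the header string is only a sample of the line `nvidia-smi` prints.

```python
import re


def driver_cuda_version(smi_header: str):
    """Extract the 'CUDA Version: X.Y' field from nvidia-smi header text."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_header)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def driver_supports(smi_header: str, required_cuda: tuple) -> bool:
    """True when the driver's maximum CUDA version is at least required_cuda."""
    version = driver_cuda_version(smi_header)
    return version is not None and version >= required_cuda


# Sample header line; replace with the real output of `nvidia-smi`.
header = "| NVIDIA-SMI 522.06 Driver Version: 522.06 CUDA Version: 11.8 |"

# (12, 4) is a hypothetical toolkit version, not taken from the log.
print(driver_cuda_version(header))
print(driver_supports(header, (12, 4)))
```

If the check fails, the usual options are updating the GPU driver or using a build made with an older CUDA toolkit.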
2,000 code suggestions a month: Get context-aware suggestions tailored to your VS Code workspace and GitHub projects.
50 Copilot Chat messages a month: Use Copilot Chat in VS Code and on GitHub to ask questions and refactor, debug, document, and explain code.
Choose your AI model: You can select between Anthropic’s Claude 3.5 Sonnet and OpenAI’s GPT-4o.
Render edits across multiple files: Use Copilot Edits to make changes to multiple files you’re working with.
Access the Copilot Extensions ecosystem: Use third-party agents to conduct web searches via Perplexity, access information from Stack Overflow, and more.
ModuleNotFoundError: No module named 'pydantic_core._pydantic_core'
Traceback:
File "C:\Dev\llm\softwaredesigne\softwaredesign-llm-application\14\.venv\Lib\site-packages\streamlit\runtime\scriptrunner\exec_code.py", line 88, in exec_func_with_error_handling
result = func()
^^^^^^
File "C:\Dev\llm\softwaredesigne\softwaredesign-llm-application\14\.venv\Lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 590, in code_to_exec
exec(code, module.__dict__)
File "C:\Dev\llm\softwaredesigne\softwaredesign-llm-application\14\app.py", line 5, in <module>
from agent import HumanInTheLoopAgent
...
C:\Dev\llm\softwaredesigne\softwaredesign-llm-application\14> pip install fastapi==0.99.0
...
Using cached starlette-0.27.0-py3-none-any.whl (66 kB)
Installing collected packages: pydantic, starlette, fastapi
Attempting uninstall: pydantic
Found existing installation: pydantic 2.9.2
Uninstalling pydantic-2.9.2:
Successfully uninstalled pydantic-2.9.2
Attempting uninstall: starlette
Found existing installation: starlette 0.41.2
Uninstalling starlette-0.41.2:
Successfully uninstalled starlette-0.41.2
Attempting uninstall: fastapi
Found existing installation: fastapi 0.115.4
Uninstalling fastapi-0.115.4:
Successfully uninstalled fastapi-0.115.4
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
composio-core 0.5.40 requires pydantic<3,>=2.6.4, but you have pydantic 1.10.19 which is incompatible.
Successfully installed fastapi-0.99.0 pydantic-1.10.19 starlette-0.27.0
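The conflict pip reports can be reproduced by hand: pinning fastapi to 0.99.0 replaced pydantic 2.9.2 with 1.10.19, while composio-core requires pydantic >=2.6.4 and <3. The sketch below checks a version against that range using only the standard library. It handles plain numeric versions only (no pre-release tags), and the helper names are illustrative.

```python
def parse_version(text: str) -> tuple:
    """Turn '2.6.4' into (2, 6, 4); numeric versions only."""
    return tuple(int(part) for part in text.split("."))


def satisfies(installed: str, lower: str, upper: str) -> bool:
    """True when lower <= installed < upper."""
    return parse_version(lower) <= parse_version(installed) < parse_version(upper)


# composio-core 0.5.40 requires pydantic>=2.6.4,<3 (from the pip output above).
for candidate in ("1.10.19", "2.9.2"):
    print(candidate, satisfies(candidate, "2.6.4", "3"))
```

Running `pip check` after an install reports the same kind of mismatch for every installed package.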
GPU P100
NVIDIA GPU model from the Tesla series.
Pascal
3,584 CUDA cores
16 GB or 12 GB of HBM2 memory.
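In general, the P100 is considered more powerful than the T4 for raw compute, mainly because it has more CUDA cores (3,584 vs. 2,560) and higher memory bandwidth from its HBM2 memory. The T4, however, has Tensor Cores, which the Pascal-based P100 lacks, so mixed-precision workloads can favor the T4. Actual performance depends on the specific workload you're running.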
GPU T4
GPU Model: T4 is an NVIDIA GPU model from the Tesla series.
Architecture: T4 is based on the Turing architecture.
CUDA Cores: T4 has 2,560 CUDA cores.
Memory: T4 typically has 16 GB of GDDR6 memory.
Performance: T4 offers good performance for a range of tasks, including machine learning, deep learning, and general GPU-accelerated computations.
Tensor Cores: T4 GPUs include Tensor Cores, which can accelerate certain deep learning operations, particularly when working with mixed-precision training.
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 26.00 MiB. GPU 0 has a total capacty of 4.00 GiB of which 0 bytes is free. Of the allocated memory 3.35 GiB is allocated by PyTorch, and 144.71 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
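The figures in the out-of-memory message can be added up to see where the 4 GiB went. The sketch below pulls the sizes out of the message text and converts them to MiB. The regular expression and the variable names are illustrative and assume the message keeps this wording.

```python
import re

UNIT_TO_MIB = {"MiB": 1.0, "GiB": 1024.0}


def sizes_in_mib(message: str) -> list:
    """Return every '<number> MiB|GiB' figure in the message, converted to MiB."""
    return [
        float(number) * UNIT_TO_MIB[unit]
        for number, unit in re.findall(r"([\d.]+) (MiB|GiB)", message)
    ]


# Text copied from the error above (including its original spelling).
message = (
    "Tried to allocate 26.00 MiB. GPU 0 has a total capacty of 4.00 GiB "
    "of which 0 bytes is free. Of the allocated memory 3.35 GiB is allocated "
    "by PyTorch, and 144.71 MiB is reserved by PyTorch but unallocated."
)

requested, total, allocated, reserved = sizes_in_mib(message)
held_by_pytorch = allocated + reserved
print(f"PyTorch holds {held_by_pytorch:.2f} MiB of {total:.0f} MiB")
print(f"Held outside the PyTorch allocator: {total - held_by_pytorch:.2f} MiB")
```

PyTorch accounts for about 3.49 GiB here. The remainder is held outside its allocator, for example by the CUDA context or by other processes on the same GPU, which is why a 26 MiB request can still fail.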
Model load has failed. Doesn't exist.
Is the GPU out of memory...? Let me check.
c:\Dev\StreamDiffusion>nvidia-smi
Wed Jan 17 17:13:45 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 522.06       Driver Version: 522.06       CUDA Version: 11.8     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ... WDDM  | 00000000:01:00.0 Off |                  N/A |
| N/A   58C    P8     9W /  N/A |    114MiB /  4096MiB |     22%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      5588    C+G   ...ser\Application\brave.exe    N/A      |
|    0   N/A  N/A      7000    C+G   ...n64\EpicGamesLauncher.exe    N/A      |
+-----------------------------------------------------------------------------+
There seems to be enough free memory right now, so perhaps it simply ran out at run time.
And in the end:
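The nvidia-smi snapshot above was taken while the GPU was idle, so it says little about peak usage during generation. A small sketch for reading the used and total memory out of a row of that table follows. The row string is copied from the output above, and the function name is illustrative.

```python
import re


def memory_usage_mib(smi_row: str):
    """Return (used, total) in MiB from an nvidia-smi row, or None if absent."""
    match = re.search(r"(\d+)MiB\s*/\s*(\d+)MiB", smi_row)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


row = "| N/A 58C P8 9W / N/A | 114MiB / 4096MiB | 22% Default |"
used, total = memory_usage_mib(row)
print(f"{total - used} MiB free of {total} MiB")
```

To see whether memory really runs out during execution, run `nvidia-smi` repeatedly while the script is working and feed each reading through the same function.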
Found cached model: engines\KBlueLeaf/kohaku-v2.1--lcm_lora-True--tiny_vae-True--max_batch-2--min_batch-2--mode-img2img\unet.engine.onnx
Generating optimizing model: engines\KBlueLeaf/kohaku-v2.1--lcm_lora-True--tiny_vae-True--max_batch-2--min_batch-2--mode-img2img\unet.engine.opt.onnx
[W] Model does not contain ONNX domain opset information! Using default opset.
UNet: original .. 0 nodes, 0 tensors, 0 inputs, 0 outputs
UNet: cleanup .. 0 nodes, 0 tensors, 0 inputs, 0 outputs
[I] Folding Constants | Pass 1
[W] Model does not contain ONNX domain opset information! Using default opset.
Only support models of onnx opset 7 and above.
Traceback (most recent call last):
File "C:\Dev\StreamDiffusion\examples\screen\..\..\utils\wrapper.py", line 546, in _load_model
compile_unet(
File "C:\Dev\StreamDiffusion\.venv\lib\site-packages\streamdiffusion\acceleration\tensorrt\__init__.py", line 76, in compile_unet
builder.build(
File "C:\Dev\StreamDiffusion\.venv\lib\site-packages\streamdiffusion\acceleration\tensorrt\builder.py", line 70, in build
optimize_onnx(
File "C:\Dev\StreamDiffusion\.venv\lib\site-packages\streamdiffusion\acceleration\tensorrt\utilities.py", line 437, in optimize_onnx
onnx_opt_graph = model_data.optimize(onnx.load(onnx_path))
File "C:\Dev\StreamDiffusion\.venv\lib\site-packages\streamdiffusion\acceleration\tensorrt\models.py", line 118, in optimize
opt.fold_constants()
File "C:\Dev\StreamDiffusion\.venv\lib\site-packages\streamdiffusion\acceleration\tensorrt\models.py", line 49, in fold_constants
onnx_graph = fold_constants(gs.export_onnx(self.graph), allow_onnxruntime_shape_inference=True)
File "<string>", line 3, in fold_constants
File "C:\Dev\StreamDiffusion\.venv\lib\site-packages\polygraphy\backend\base\loader.py", line 40, in __call__
return self.call_impl(*args, **kwargs)
File "C:\Dev\StreamDiffusion\.venv\lib\site-packages\polygraphy\util\util.py", line 694, in wrapped
return func(*args, **kwargs)
File "C:\Dev\StreamDiffusion\.venv\lib\site-packages\polygraphy\backend\onnx\loader.py", line 424, in call_impl
postfold_num_nodes = onnx_util.get_num_nodes(model)
File "C:\Dev\StreamDiffusion\.venv\lib\site-packages\polygraphy\backend\onnx\util.py", line 41, in get_num_nodes
return _get_num_graph_nodes(model.graph)
AttributeError: 'NoneType' object has no attribute 'graph'
Acceleration has failed. Falling back to normal mode.
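The log shows a cached unet.engine.onnx being picked up and then reported as having 0 nodes, which suggests the cached file is empty or truncated, possibly left over from an earlier run that failed. That is a guess, not something the log confirms. Under that assumption, the sketch below lists suspiciously small .onnx files under the engine directory so they can be deleted and exported again. The `min_bytes` threshold is a hypothetical value.

```python
from pathlib import Path


def suspect_onnx_files(engine_dir, min_bytes=1024):
    """Return .onnx files under engine_dir that are smaller than min_bytes."""
    root = Path(engine_dir)
    if not root.is_dir():
        return []
    return sorted(
        path
        for path in root.rglob("*.onnx")
        if path.is_file() and path.stat().st_size < min_bytes
    )


if __name__ == "__main__":
    # "engines" is the cache directory name that appears in the log above.
    for path in suspect_onnx_files("engines"):
        print(f"suspiciously small: {path} ({path.stat().st_size} bytes)")
```

Deleting the flagged files forces the export to run again on the next start instead of reusing the broken cache.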
>pip3 install torch==2.1.0 torchvision==0.16.0 xformers --index-url https://download.pytorch.org/whl/cu118
ERROR: Could not find a version that satisfies the requirement torch (from versions: none)
ERROR: No matching distribution found for torch
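"from versions: none" means the index has no wheel that matches the running interpreter. A common cause is a Python version or a 32-bit build that the requested torch release does not cover. The sketch below reports both properties. The supported range passed in at the end is an assumption (torch 2.1.0 wheels for CPython 3.8 to 3.11) and should be checked against the PyTorch release notes.

```python
import struct
import sys


def interpreter_summary():
    """Return ((major, minor), pointer width in bits) for the running Python."""
    bits = struct.calcsize("P") * 8
    return (sys.version_info.major, sys.version_info.minor), bits


def wheel_plausible(version, bits, supported_minors):
    """True for a 64-bit Python 3 whose minor version is in supported_minors."""
    major, minor = version
    return major == 3 and bits == 64 and minor in supported_minors


version, bits = interpreter_summary()
# Assumption: torch 2.1.0 ships wheels for CPython 3.8 to 3.11 only.
print(version, bits, wheel_plausible(version, bits, range(8, 12)))
```

If the result is False, creating the virtual environment with a supported 64-bit Python is the usual remedy.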
Error invoking remote method ‘docker-start-container’: Error: (HTTP code 500) server error – Ports are not available: exposing port TCP 127.0.0.1:5432 -> 0.0.0.0:0: listen tcp 127.0.0.1:5432: bind: An attempt was made to access a socket in a way forbidden by its access permissions.
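The bind error means something on the host prevents listening on 127.0.0.1:5432. Typical suspects are a locally installed PostgreSQL, which uses 5432 by default, or, on Windows, a reserved port range. The sketch below tests whether the port can be bound at all. The function name is illustrative.

```python
import socket


def can_bind(port: int, host: str = "127.0.0.1") -> bool:
    """Try to bind host:port and release it again; False if the OS refuses."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
        return True


print(can_bind(5432))
```

If this prints False while no other process is listening on the port, `netsh interface ipv4 show excludedportrange protocol=tcp` lists the ranges Windows has reserved. Mapping the container to a different host port is the simplest workaround.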