Available ComfyUI Nodes
This guide covers the available nodes and requirements for creating real-time video pipelines using ComfyUI with Livepeer.
Required Video Input/Output Nodes
These nodes are required for creating real-time video pipelines.
- Input:
- Video stream URL or device ID
- Optional configuration parameters
- Output:
- RGB frame tensor (3, H, W)
- Frame metadata (timestamp, index)
- Performance Requirements:
- Frame processing time: < 5ms
- VRAM usage: < 500MB
- Buffer size: ≤ 2 frames
- Supported formats: RTMP, WebRTC, V4L2
- Best Practices:
- Set fixed frame rate
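As a quick illustration of the frame contract above, the sketch below (plain PyTorch; receive_frame is a hypothetical stand-in for the input node's output) checks the (3, H, W) layout and keeps the buffer within the two-frame limit:

import collections
import torch

# receive_frame() is a hypothetical stand-in for the video input node's output.
def receive_frame() -> torch.Tensor:
    return torch.zeros(3, 512, 512)

frame_buffer = collections.deque(maxlen=2)  # keep the buffer at two frames or fewer

frame = receive_frame()
assert frame.ndim == 3 and frame.shape[0] == 3, "expected an RGB frame shaped (3, H, W)"
frame_buffer.append(frame)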
Inference Nodes
Nodes for analyzing video frames in real time. They can be used for tasks such as object detection, segmentation, and depth estimation.
- Input: RGB frame (3, H, W)
- Output: Depth map (1, H, W)
- Performance Requirements:
- Inference time: < 20ms
- VRAM usage: 2GB
- Batch size: 1
- Best Practices:
- Place early in workflow
- Cache results for static scenes
- Use lowest viable resolution
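Where the scene is largely static, the previous depth map can be reused instead of re-running inference. A rough sketch of that caching, assuming PyTorch tensors and a hypothetical run_depth_model callable standing in for the node's inference call:

import torch

_last_frame = None
_last_depth = None

def depth_with_cache(frame: torch.Tensor, run_depth_model, diff_threshold: float = 1e-3):
    """Reuse the cached depth map when the scene is effectively static."""
    global _last_frame, _last_depth
    if _last_frame is not None and torch.mean(torch.abs(frame - _last_frame)) < diff_threshold:
        return _last_depth  # scene unchanged: skip inference
    _last_frame = frame
    _last_depth = run_depth_model(frame)  # stand-in for the depth node's inference call
    return _last_depth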
- Navigate to the custom_nodes directory in your ComfyUI workspace
cd /workspace/ComfyUI/custom_nodes
- Clone the repository
git clone https://github.com/yuvraj108c/ComfyUI-Depth-Anything-Tensorrt.git
cd ./ComfyUI-Depth-Anything-Tensorrt
- Install the node
conda activate comfyui
pip install -r requirements.txt
- Download the TensorRT onnx file and build the engine
wget -O depth_anything_vitl14.onnx https://huggingface.co/yuvraj108c/Depth-Anything-2-Onnx/resolve/main/depth_anything_v2_vitb.onnx?download=true --content-disposition
python export_trt.py
- Copy the TensorRT engine file to the ComfyUI models directory
mkdir -p /workspace/ComfyUI/models/tensorrt/depth-anything/
mv depth_anything_vitl14-fp16.engine /workspace/ComfyUI/models/tensorrt/depth-anything/
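The example workflow below (ComfyUI API format) chains LoadTensor, Depth Anything Tensorrt, and SaveTensor to run depth estimation on each incoming frame: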
{
"1": {
"inputs": {
"images": [
"2",
0
]
},
"class_type": "SaveTensor",
"_meta": {
"title": "SaveTensor"
}
},
"2": {
"inputs": {
"engine": "depth_anything_vitl14-fp16.engine",
"images": [
"3",
0
]
},
"class_type": "DepthAnythingTensorrt",
"_meta": {
"title": "Depth Anything Tensorrt"
}
},
"3": {
"inputs": {},
"class_type": "LoadTensor",
"_meta": {
"title": "LoadTensor"
}
}
}
Useful for tracking an object given positive and negative prompt coordinates. It can be combined with Florence2 and other object detection nodes.
- Input: RGB frame (3, H, W)
- Output: Segmentation mask (1, H, W)
- Performance Requirements:
- Inference time: < 30ms
- VRAM usage: 3GB
- Batch size: 1
- Best Practices:
- Cache static masks
- Use mask erosion for stability
- Implement confidence thresholding
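A minimal sketch of the mask erosion and confidence thresholding practices listed above, assuming a PyTorch probability mask shaped (1, H, W); the helper name is illustrative:

import torch
import torch.nn.functional as F

def stabilize_mask(mask: torch.Tensor, confidence: float = 0.5, erode_px: int = 2) -> torch.Tensor:
    """Threshold a probability mask, then erode it to suppress flickering edges."""
    binary = (mask > confidence).float()  # confidence thresholding
    kernel = 2 * erode_px + 1
    # Erosion via min-pooling: a pixel survives only if its whole neighborhood is foreground.
    eroded = -F.max_pool2d(-binary.unsqueeze(0), kernel_size=kernel, stride=1, padding=erode_px)
    return eroded.squeeze(0)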
- Navigate to the custom_nodes directory in your ComfyUI workspace
cd /workspace/ComfyUI/custom_nodes
- Clone the repository
git clone https://github.com/pschroedl/ComfyUI-SAM2-Realtime.git
cd ComfyUI-SAM2-Realtime
- Install requirements
conda activate comfyui
pip install -r requirements.txt
For Windows, ensure the prerequisites are installed by following the Windows section in Install Nodes in ComfyUI
- Input: RGB frame (3, H, W)
- Output: Feature vector (1, 512)
- Performance Requirements:
- Inference time: < 15ms
- VRAM usage: 1GB
- Batch size: 1
- Best Practices:
- Cache embeddings for known references
- Use cosine similarity for matching
- Implement feature vector normalization
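To illustrate the normalization and cosine-similarity practices against cached reference embeddings, a small PyTorch sketch (function and variable names are illustrative):

import torch
import torch.nn.functional as F

def match_reference(frame_feature: torch.Tensor, reference_features: torch.Tensor) -> int:
    """Return the index of the cached reference embedding closest to the current frame."""
    frame_feature = F.normalize(frame_feature, dim=-1)             # (1, 512), unit length
    reference_features = F.normalize(reference_features, dim=-1)   # (N, 512), cached references
    similarity = frame_feature @ reference_features.T              # cosine similarity since both are normalized
    return int(similarity.argmax())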
- Navigate to the custom_nodes directory in your ComfyUI workspace
cd /workspace/ComfyUI/custom_nodes
- Clone the repository
git clone https://github.com/ad-astra-video/ComfyUI-Florence2-Vision.git
cd ComfyUI-Florence2-Vision
git checkout 0c624e61b6606801751bd41d93a09abe9844bea7
- Install requirements
conda activate comfyui
pip install -r requirements.txt
Download example workflows here
Generative and Control Nodes
Nodes for applying visual effects and controlling video content in real time.
A suite of nodes for real-time video processing and control. Some examples of nodes included are:
- Value: FloatControl, IntControl and StringControl
- Sequence: FloatSequence, IntSequence and StringSequence
- Motion: MotionController, IntegerMotionController, ROINode
- Utility: FPSMonitor, QuickShapeMask, DTypeConverter, FastWebcamCapture
- VAE: TAESDVaeEncode, TAESDVaeDecode
Refer to the README for more details
- Navigate to the custom_nodes directory in your ComfyUI workspace
cd /workspace/ComfyUI/custom_nodes
- Clone the repository and install requirements
git clone https://github.com/ryanontheinside/ComfyUI_RealTimeNodes
cd ComfyUI_RealTimeNodes
pip install -r requirements.txt
Download example workflows here
This node is useful for styling videos with a diffusion effect. It can be combined with other nodes to create masked effects.
- Input: RGB frame (3, H, W)
- Output: Styled frame (3, H, W)
- Navigate to the custom_nodes directory in your ComfyUI workspace
cd /workspace/ComfyUI/custom_nodes
- Clone the repository
git clone https://github.com/pschroedl/ComfyUI-StreamDiffusion.git
cd ComfyUI-StreamDiffusion
git checkout 628c9f81abdc4f9b0207bcb556f35a8cbc2d7319
- Install requirements
conda activate comfyui
pip install -r requirements.txt
For Windows, ensure the prerequisites are installed by following the Windows section in Install Nodes in ComfyUI
Download example workflows here
The NVIDIA TensorRT plugin improves Stable Diffusion performance by generating a static TensorRT engine for a fixed set of StableDiffusion parameters.
Refer to the README for install and usage instructions.
- Input:
- Source image (3, H, W)
- Driving frame (3, H, W)
- Output: Animated frame (3, H, W)
- Performance Requirements:
- Inference time: < 50ms
- VRAM usage: 4GB
- Batch size: 1
- Best Practices:
- Pre-process source images
- Implement motion smoothing
- Cache facial landmarks
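One way to realize the motion-smoothing and landmark-caching practices is an exponential moving average over landmark coordinates between frames, as in this illustrative PyTorch sketch:

import torch

class LandmarkSmoother:
    """Exponential moving average over facial landmarks to reduce per-frame jitter."""
    def __init__(self, alpha: float = 0.6):
        self.alpha = alpha
        self.cached = None  # last smoothed landmarks, reusable when detection is skipped

    def update(self, landmarks: torch.Tensor) -> torch.Tensor:
        if self.cached is None:
            self.cached = landmarks
        else:
            self.cached = self.alpha * landmarks + (1 - self.alpha) * self.cached
        return self.cached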
- Navigate to the custom_nodes directory in your ComfyUI workspace
cd /workspace/ComfyUI/custom_nodes
- Clone the repository
git clone https://github.com/kijai/ComfyUI-LivePortraitKJ.git
cd ComfyUI-LivePortraitKJ
- Install requirements
conda activate comfyui
pip install -r requirements.txt
- Input:
- Conditioning tensor
- Latent tensor
- Output: Generated frame (3, H, W)
- Performance Requirements:
- Inference time: < 50ms
- VRAM usage: 4GB
- Maximum steps: 20
- Best Practices:
- Use TensorRT optimization
- Implement denoising strength control
- Cache conditioning tensors
- Input:
- Control signal
- Target tensor
- Output: Controlled tensor
- Performance Requirements:
- Inference time: < 30ms
- VRAM usage: 2GB
- Resolution: ≤ 512
- Best Practices:
- Use adaptive conditioning
- Implement strength scheduling
- Cache control signals
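A simple form of the strength scheduling mentioned above is to ramp the control strength over a fixed number of frames; the helper below is an illustrative sketch:

def control_strength(frame_index: int, start: float = 1.0, end: float = 0.4, ramp_frames: int = 30) -> float:
    """Linearly interpolate control strength from `start` to `end` over `ramp_frames` frames."""
    t = min(frame_index / ramp_frames, 1.0)
    return start + (end - start) * t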
Supporting Nodes
These nodes provide inputs, prompts, and other supporting functions for the video pipeline.
- Input:
- Latent tensor
- Conditioning
- Output: Sampled latent
- Performance Requirements:
- Maximum steps: 20
- VRAM usage: 2GB
- Scheduler: euler_ancestral
- Best Practices:
- Use adaptive step sizing
- Cache conditioning tensors
- Input: Text prompts
- Output: Conditioning tensors
- Performance Requirements:
- Processing time: < 5ms
- VRAM usage: minimal
- Best Practices:
- Cache common prompts
- Use consistent style tokens
- Implement prompt weighting
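Caching conditioning tensors for common prompts can be as simple as a dictionary keyed by prompt text; in this sketch, encode_prompt is a stand-in for the text-encode step:

_conditioning_cache = {}

def cached_conditioning(prompt: str, encode_prompt):
    """Encode a prompt once and reuse the conditioning tensor on later frames."""
    if prompt not in _conditioning_cache:
        _conditioning_cache[prompt] = encode_prompt(prompt)
    return _conditioning_cache[prompt]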
- Input: Latent tensor
- Output: RGB frame
- Performance Requirements:
- Inference time: < 10ms
- VRAM usage: 1GB
- Tile size: 512
- Best Practices:
- Use tiling for large frames
- Implement half-precision
- Cache common latents
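As a sketch of the half-precision practice, the decode step can be wrapped in autocast (decode_latent is a stand-in for the actual decode call):

import torch

def decode_half_precision(latent: torch.Tensor, decode_latent) -> torch.Tensor:
    """Run the decode step in fp16 on CUDA to cut VRAM use and latency."""
    with torch.autocast(device_type="cuda", dtype=torch.float16):
        return decode_latent(latent)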
- Input:
- Reference image
- Target tensor
- Output: Conditioned tensor
- Performance Requirements:
- Inference time: < 20ms
- VRAM usage: 2GB
- Reference resolution: ≤ 512x512
- Best Practices:
- Cache reference embeddings
- Use consistent weights
- Implement cross-attention
- Input: Any tensor
- Output: Cached tensor
- Performance Requirements:
- Access time: < 1ms
- Maximum size: 2GB
- Cache type: GPU
- Best Practices:
- Implement LRU eviction
- Monitor cache pressure
- Clear on scene changes
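An LRU cache with a byte budget and an explicit clear-on-scene-change hook might look like the sketch below (PyTorch tensors assumed; the 2 GB ceiling mirrors the maximum size above):

import collections
import torch

class TensorCache:
    """GPU tensor cache with least-recently-used eviction and a byte budget."""
    def __init__(self, max_bytes: int = 2 * 1024 ** 3):
        self.max_bytes = max_bytes
        self.entries = collections.OrderedDict()

    def put(self, key: str, tensor: torch.Tensor):
        self.entries[key] = tensor
        self.entries.move_to_end(key)
        while self._size() > self.max_bytes and len(self.entries) > 1:
            self.entries.popitem(last=False)  # evict the least recently used entry

    def get(self, key: str):
        if key in self.entries:
            self.entries.move_to_end(key)
            return self.entries[key]
        return None

    def clear(self):
        """Call on scene changes to avoid stale cached tensors."""
        self.entries.clear()

    def _size(self) -> int:
        return sum(t.element_size() * t.nelement() for t in self.entries.values())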
- Input:
- Control signal
- Target tensor
- Output: Controlled tensor
- Performance Requirements:
- Inference time: < 30ms
- VRAM usage: 2GB
- Resolution: ≤ 512
- Best Practices:
- Use adaptive conditioning
- Implement strength scheduling
- Cache control signals
Default Nodes
All default nodes that ship with ComfyUI are available. The list below is subject to change.
- AddNoise
- AlignYourStepsScheduler
- BasicGuider
- BasicScheduler
- BetaSamplingScheduler
- CFGGuider
- CLIPAttentionMultiply
- CLIPLoader
- CLIPMergeAdd
- CLIPMergeSimple
- CLIPMergeSubtract
- CLIPSave
- CLIPSetLastLayer
- CLIPTextEncode
- CLIPTextEncodeControlnet
- CLIPTextEncodeFlux
- CLIPTextEncodeHunyuanDiT
- CLIPTextEncodeSD3
- CLIPTextEncodeSDXL
- CLIPTextEncodeSDXLRefiner
- CLIPVisionEncode
- CLIPVisionLoader
- Canny
- CheckpointLoader
- CheckpointLoaderSimple
- CheckpointSave
- ConditioningAverage
- ConditioningCombine
- ConditioningConcat
- ConditioningSetArea
- ConditioningSetAreaPercentage
- ConditioningSetAreaStrength
- ConditioningSetMask
- ConditioningSetTimestepRange
- ConditioningZeroOut
- ControlNetApply
- ControlNetApplyAdvanced
- ControlNetApplySD3
- ControlNetInpaintingAliMamaApply
- ControlNetLoader
- CropMask
- DiffControlNetLoader
- DifferentialDiffusion
- DiffusersLoader
- DisableNoise
- DualCFGGuider
- DualCLIPLoader
- EmptyHunyuanLatentVideo
- EmptyImage
- EmptyLTXVLatentVideo
- EmptyLatentAudio
- EmptyLatentImage
- EmptyMochiLatentVideo
- EmptySD3LatentImage
- ExponentialScheduler
- FeatherMask
- FlipSigmas
- FluxGuidance
- FreeU
- FreeU_V2
- GITSScheduler
- GLIGENLoader
- GLIGENTextBoxApply
- GrowMask
- HyperTile
- HypernetworkLoader
- ImageBatch
- ImageBlend
- ImageBlur
- ImageColorToMask
- ImageCompositeMasked
- ImageCrop
- ImageFromBatch
- ImageInvert
- ImageOnlyCheckpointLoader
- ImageOnlyCheckpointSave
- ImagePadForOutpaint
- ImageQuantize
- ImageScale
- ImageScaleBy
- ImageScaleToTotalPixels
- ImageSharpen
- ImageToMask
- ImageUpscaleWithModel
- InpaintModelConditioning
- InstructPixToPixConditioning
- InvertMask
- JoinImageWithAlpha
- KSampler
- KSamplerAdvanced
- KSamplerSelect
- KarrasScheduler
- LTXVConditioning
- LTXVImgToVideo
- LTXVScheduler
- LaplaceScheduler
- LatentAdd
- LatentApplyOperation
- LatentApplyOperationCFG
- LatentBatch
- LatentBatchSeedBehavior
- LatentBlend
- LatentComposite
- LatentCompositeMasked
- LatentCrop
- LatentFlip
- LatentFromBatch
- LatentInterpolate
- LatentMultiply
- LatentOperationSharpen
- LatentOperationTonemapReinhard
- LatentRotate
- LatentSubtract
- LatentUpscale
- LatentUpscaleBy
- Load3D
- Load3DAnimation
- LoadAudio
- LoadImage
- LoadImageMask
- LoadLatent
- LoraLoader
- LoraLoaderModelOnly
- LoraSave
- Mahiro
- MaskComposite
- MaskToImage
- ModelMergeAdd
- ModelMergeAuraflow
- ModelMergeBlocks
- ModelMergeFlux1
- ModelMergeLTXV
- ModelMergeMochiPreview
- ModelMergeSD1
- ModelMergeSD2
- ModelMergeSD35_Large
- ModelMergeSD3_2B
- ModelMergeSDXL
- ModelMergeSimple
- ModelMergeSubtract
- ModelSamplingAuraFlow
- ModelSamplingContinuousEDM
- ModelSamplingContinuousV
- ModelSamplingDiscrete
- ModelSamplingFlux
- ModelSamplingLTXV
- ModelSamplingSD3
- ModelSamplingStableCascade
- ModelSave
- Morphology
- PatchModelAddDownscale
- PerpNeg
- PerpNegGuider
- PerturbedAttentionGuidance
- PhotoMakerEncode
- PhotoMakerLoader
- PolyexponentialScheduler
- PorterDuffImageComposite
- PreviewAudio
- PreviewImage
- RandomNoise
- RebatchImages
- RebatchLatents
- RepeatImageBatch
- RepeatLatentBatch
- RescaleCFG
- SDTurboScheduler
- SD_4XUpscale_Conditioning
- SV3D_Conditioning
- SVD_img2vid_Conditioning
- SamplerCustom
- SamplerCustomAdvanced
- SamplerDPMAdaptative
- SamplerDPMPP_2M_SDE
- SamplerDPMPP_2S_Ancestral
- SamplerDPMPP_3M_SDE
- SamplerDPMPP_SDE
- SamplerEulerAncestral
- SamplerEulerAncestralCFGPP
- SamplerEulerCFGpp
- SamplerLCMUpscale
- SamplerLMS
- SaveAnimatedPNG
- SaveAnimatedWEBP
- SaveAudio
- SaveImage
- SaveImageWebsocket
- SaveLatent
- SelfAttentionGuidance
- SetLatentNoiseMask
- SetUnionControlNetType
- SkipLayerGuidanceDiT
- SkipLayerGuidanceSD3
- SolidMask
- SplitImageWithAlpha
- SplitSigmas
- SplitSigmasDenoise
- StableCascade_EmptyLatentImage
- StableCascade_StageB_Conditioning
- StableCascade_StageC_VAEEncode
- StableCascade_SuperResolutionControlnet
- StableZero123_Conditioning
- StableZero123_Conditioning_Batched
- StubConstantImage
- StubFloat
- StubImage
- StubInt
- StubMask
- StyleModelApply
- StyleModelLoader
- TestAccumulateNode
- TestAccumulationGetItemNode
- TestAccumulationGetLengthNode
- TestAccumulationHeadNode
- TestAccumulationSetItemNode
- TestAccumulationTailNode
- TestAccumulationToListNode
- TestBoolOperationNode
- TestCustomIsChanged
- TestCustomValidation1
- TestCustomValidation2
- TestCustomValidation3
- TestCustomValidation4
- TestCustomValidation5
- TestDynamicDependencyCycle
- TestExecutionBlocker
- TestFloatConditions
- TestForLoopClose
- TestForLoopOpen
- TestIntConditions
- TestIntMathOperation
- TestIsChangedWithConstants
- TestLazyMixImages
- TestListToAccumulationNode
- TestMakeListNode
- TestMixedExpansionReturns
- TestStringConditions
- TestToBoolNode
- TestVariadicAverage
- TestWhileLoopClose
- TestWhileLoopOpen
- ThresholdMask
- TomePatchModel
- TorchCompileModel
- TripleCLIPLoader
- UNETLoader
- UNetCrossAttentionMultiply
- UNetSelfAttentionMultiply
- UNetTemporalAttentionMultiply
- UpscaleModelLoader
- VAEDecode
- VAEDecodeAudio
- VAEDecodeTiled
- VAEEncode
- VAEEncodeAudio
- VAEEncodeForInpaint
- VAEEncodeTiled
- VAELoader
- VAESave
- VPScheduler
- VideoLinearCFGGuidance
- VideoTriangleCFGGuidance
- WebcamCapture
- unCLIPCheckpointLoader
- unCLIPConditioning