Will there be further optimizations to BiRefNet in the future?
Thanks! If circumstances allow, you can enable background_color_synthesis in config.py and then run data synthesis with pure white backgrounds; that should improve results for scenes like this. The reason HRSOD works well may also be that its dataset happens to fit your case particularly well.
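In case it helps, here is a minimal sketch of what pure-white-background synthesis can look like. Only the `background_color_synthesis` switch in config.py is from the repo; the compositing below (pasting the foreground onto a white canvas via its GT mask) is an illustrative assumption, not BiRefNet's exact pipeline.

```python
from PIL import Image

def synthesize_white_background(image_path: str, mask_path: str) -> Image.Image:
    """Composite a foreground onto a pure white background using its GT mask.

    Illustrative sketch only -- BiRefNet's own synthesis is controlled by
    `background_color_synthesis` in config.py and may differ in detail.
    """
    image = Image.open(image_path).convert("RGB")
    mask = Image.open(mask_path).convert("L")  # GT alpha / segmentation mask
    white_bg = Image.new("RGB", image.size, (255, 255, 255))
    # Keep foreground pixels where the mask is white, fill the rest with white.
    return Image.composite(image, white_bg, mask)
```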
OK, I'll give it a try.
Sorry, I just registered and didn't know how to reply to your message right away. The first image is my original image, and the second is the one inferred by your model.
The complete code is below.
The code was adapted from this repository:
https://huggingface.co/spaces/not-lain/background-removal

Hello Mr. Zheng Peng, thank you very much for this excellent work. I've been using your repository for batch background removal on furniture product images, and I'd like to ask what might cause the matting of this product to fail?
```python
import os

from tqdm import tqdm
from loadimg import load_img
from transformers import AutoModelForImageSegmentation
import torch
from torchvision import transforms
from PIL import Image

# Setup device
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using device: {device}")
torch.set_float32_matmul_precision(["high", "highest"][0])

# Load model from local directory
print("Loading model from local 'model' directory...")
try:
    birefnet = AutoModelForImageSegmentation.from_pretrained(
        "model", trust_remote_code=True
    )
    birefnet.to(device)
    print("Model loaded successfully.")
except Exception as e:
    print(f"Error loading model: {e}")
    print("Please ensure the 'model' directory exists and contains the model files.")
    exit(1)

transform_image = transforms.Compose(
    [
        transforms.Resize((1024, 1024)),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
    ]
)


def process(image: Image.Image) -> Image.Image:
    """
    Apply BiRefNet-based image segmentation to remove the background.
    """
    image_size = image.size
    input_images = transform_image(image).unsqueeze(0).to(device)
    # Prediction
    with torch.no_grad():
        preds = birefnet(input_images)[-1].sigmoid().cpu()
    pred = preds[0].squeeze()
    pred_pil = transforms.ToPILImage()(pred)
    mask = pred_pil.resize(image_size)
    image.putalpha(mask)
    return image


if __name__ == "__main__":
    input_dir = r"H:\product_images\resize_800"
    output_dir = r"H:\product_images\birefnet_800"

    if not os.path.exists(input_dir):
        print(f"Error: Input directory not found: {input_dir}")
        exit(1)
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)
        print(f"Created output directory: {output_dir}")

    # Get list of image files
    valid_extensions = {'.jpg', '.jpeg', '.png', '.bmp', '.webp', '.tiff'}
    image_files = [f for f in os.listdir(input_dir) if os.path.splitext(f)[1].lower() in valid_extensions]
    print(f"Found {len(image_files)} images to process in {input_dir}")

    for filename in tqdm(image_files, desc="Processing images"):
        input_path = os.path.join(input_dir, filename)
        # Construct output filename with .png extension (required for transparency)
        name_part = os.path.splitext(filename)[0]
        output_filename = f"{name_part}.png"
        output_path = os.path.join(output_dir, output_filename)
        try:
            im = load_img(input_path, output_type="pil")
            im = im.convert("RGB")
            processed = process(im)
            processed.save(output_path)
        except Exception as e:
            print(f"Error processing {filename}: {e}")

    print("Batch processing complete.")
```
What is your original image here? I'm a bit confused.
It is indeed very strange; I suspect the image was compressed. But this is also the first time I've seen this kind of problem.
I've seen your email. I may need some time to look through it before I figure it out; I'll get back to you once I have results, both here and by email.
This is an issue caused by the original image being RGBA: the unexpected regions are fully transparent in the alpha channel, so they are invisible in an image viewer, but if you take the RGB channels directly you get exactly what is shown above.
Therefore, I have added RGBA-to-RGB handling to the online demo on HF and set RGBA as the default format: https://huggingface.co/spaces/ZhengPeng7/BiRefNet_demo/blob/main/app.py#L256
With that, this issue can be considered resolved.
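For anyone hitting the same thing, a minimal sketch of RGBA-to-RGB flattening (compositing onto an opaque background before inference) looks roughly like this; the exact handling used by the demo is the one in the app.py linked above, and the white background here is only an illustrative choice.

```python
from PIL import Image

def rgba_to_rgb(image: Image.Image, background=(255, 255, 255)) -> Image.Image:
    """Flatten an RGBA image onto an opaque background before inference.

    Without this, fully transparent regions can carry arbitrary RGB values
    that the model ends up seeing once the alpha channel is dropped.
    """
    if image.mode != "RGBA":
        return image.convert("RGB")
    bg = Image.new("RGB", image.size, background)
    bg.paste(image, mask=image.split()[3])  # use the alpha channel as paste mask
    return bg
```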
